Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

All subtopics
Posts under Media Technologies topic

Post

Replies

Boosts

Views

Activity

SpeechTranscriber not providing audioTimeRange for most results
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes. Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
3
0
230
Aug ’25
Can I Fade Out Track Volume Before End Using ApplicationMusicPlayer?
I’m building a music app using Apple Music streaming via ApplicationMusicPlayer. My goal is to decrease the volume of the current song during the last 10 seconds, and when the next track begins, restore the volume to its normal level. I know that ApplicationMusicPlayer doesn’t expose a volume API, and I want to avoid triggering the system volume HUD. ✅ Using Apple Music streaming (not local files) ❓ Is it possible to implement per-track fade-out/fade-in logic with ApplicationMusicPlayer? Appreciate any clarification or official guidance!
0
0
129
Jun ’25
HDR video & screen brightness
When I play an HDR video in the iPhone Photos app, I can see the HDR effect obviously. But if this HDR video is played continuously for more than 30-40 minutes, the HDR effect will disappear and the brightness will be compressed to the SDR range. This issue will appear on any iPhone. Depending on the phone, it may be 20-30 minutes, or 30-40 minutes, or even a few minutes, such as iPhone 12 mini. Similarly, if I use AVPlayer to play and preview an HDR video, if it plays more than 30-40 minutes, the HDR effect will disappear and the screen brightness will dim. Also the currentEDRHeadroom will gradually decrease to 1 Note, test it with an HDR video longer than 1 hour, and if the video is short, please loop it. My question is how to avoid losing the HDR effect after 30-40 minutes when I use CAMetalLayer to render any HDR video.
1
0
227
Jul ’25
AVAssetResourceLoaderDelegate for radio stream
Hi everyone, I’m trying to use AVAssetResourceLoaderDelegate to handle a live radio stream (e.g. Icecast/HTTP stream). My goal is to have access to the last 30 seconds of audio data during playback, so I can analyze it for specific audio patterns in near-real-time. I’ve implemented a custom resource loader that works fine for podcasts and static files, where the file size and content length are known. However, for infinite live streams, my current implementation stops receiving new loading requests after the first one is served. As a result, the playback either stalls or fails to continue. Has anyone successfully used AVAssetResourceLoaderDelegate with a continuous radio stream? Or maybe you can suggest betterapproach for buffering and analyzing live audio? Any tips, examples, or advice would be appreciated. Thanks!
0
0
181
Jun ’25
Is there a way to get lossless music playback on macOS?
I noticed that while playing back the same tracks via MusicKit on different OSes I get different results regarding the audio files being streamed. Playing back a lossless file with 24Bit 48kHz and watching the Console for RemotePlayerService I get: on iPadOS: Lossless; groupID: audio-alac-stereo-48000-24; bitDepth: 24-bit; sampleRate: 48khz; codec: alac; channels: 2; layout: Stereo; on macOS: Creating AudioQueue with format:'paac', framesPerPacket:1024, sampleRate:44100 While the iPad looks perfect, the Mac does not. Is there a way to fix this issue on macOS. BTW: I switched the Audio-Midi Settings before, after and while the macOS App was lunched. I also switched to different output devices. I wasn't able to change the bad audio-output on the mac. I tested this under Sequoia 15.5 and Tahoe beta 1, Xcode 16.4 and 26 beta 1. The AudioVariants of the Album/Tracks are .dolbyAtmos, .lossless, .lossyStereo Apple Music displays Lossless 24 Bit/48 kHz ALAC when clicking on the playercontroll icon on macOS I hope there are only some missing or misconfigured properties to get macOS up to par. Thanks :-)
0
1
189
Jun ’25
Execution breakpoint when trying to play a music library file with AVAudioEngine
Hi all, I'm working on an audio visualizer app that plays files from the user's music library utilizing MediaPlayer and AVAudioEngine. I'm working on getting the music library functionality working before the visualizer aspect. After setting up the engine for file playback, my app inexplicably crashes with an EXC_BREAKPOINT with code = 1. Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements. I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that. The device that I'm testing this on is running iOS 26 beta 3, although my app is designed for iOS 18 and above. After commenting out code, it seems that the app crashes at the scheduleFile call in the play function, but I'm not fully sure. Here is the setupAudioEngine function: private func setupAudioEngine() { do { try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default) try AVAudioSession.sharedInstance().setActive(true) } catch { print("Audio session error: \(error)") } engine.attach(playerNode) engine.attach(analyzer) engine.connect(playerNode, to: analyzer, format: nil) engine.connect(analyzer, to: engine.mainMixerNode, format: nil) analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in self?.processAudioBuffer(buffer) } } Here is the play function: func play(_ mediaItem: MPMediaItem) { guard let assetURL = mediaItem.assetURL else { print("No asset URL for media item") return } stop() do { audioFile = try AVAudioFile(forReading: assetURL) guard let audioFile else { print("Failed to create audio file") return } duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate if !engine.isRunning { try engine.start() } playerNode.scheduleFile(audioFile, at: nil) playerNode.play() DispatchQueue.main.async { [weak self] in self?.isPlaying = true self?.startDisplayLink() } } catch { print("Error playing audio: \(error)") DispatchQueue.main.async { [weak self] in self?.isPlaying = false self?.stopDisplayLink() } } } Here is a link to my test project if you want to try it out for yourself: https://github.com/aabagdi/VisualMan-example Thanks!
8
0
711
Jul ’25
FxPlug SDK 4.3.2 causes dyld errors when loaded on versions of macOS prior to 14.6
FxPlug is one of Apple’s official SDKs, recently updated to version 4.3.2. In theory the SDK should guarantee third-parties can build plug-ins that are backward compatible with older versions of Final Cut Pro, Motion and Compressor. FxPlug SDK includes two frameworks that third-party developers like me end up bundling inside our third-party plugins: FxPlug.framework and PlugInManager.framework. Behind the scenes, the SDK relies on PlugInKit, but the FxPlug.framework provides abstractions so that third-parties don't have to handle the intricacies of XPC directly. The most recent version of FxPlug.framework included with the SDK was possibly built with an error: the Info.plist shows a LSMinimumSystemVersion entry of 14.6, suggesting the binary may have been compiled and linked with MACOSX_DEPLOYMENT_TARGET set to 14.6 by accident. The problem: when older versions of Final Cut Pro or Motion load a third-party plugin (itself built with the appropriate deployment target, macOS 11 or 12, for example) on pre-macOS 14.6, the dynamic linker immediately loads Apple’s own FxPlug.framework, but this causes the process to crash immediately: Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 libobjc.A.dylib 0x7ff81e065955 map_images_nolock + 5399 1 libobjc.A.dylib 0x7ff81e0643d6 map_images + 67 2 dyld 0x10bd551fb invocation function for block in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*)) + 275 3 dyld 0x10bd506c9 dyld4::RuntimeState::withLoadersReadLock(void () block_pointer) + 41 4 dyld 0x10bd550e2 dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*)) + 82 5 dyld 0x10bd68d45 dyld4::APIs::_dyld_objc_notify_register(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*)) + 79 6 libobjc.A.dylib 0x7ff81e064244 _objc_init + 1279 7 libdispatch.dylib 0x7ff81e01d993 _os_object_init + 13 8 libdispatch.dylib 0x7ff81e02b1b8 libdispatch_init + 311 9 libSystem.B.dylib 0x7ff828fd585f libSystem_initializer + 238 10 dyld 0x10bd5ae4f invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 182 11 dyld 0x10bd81aad invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 242 12 dyld 0x10bd78e26 invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 557 13 dyld 0x10bd47db3 dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 129 14 dyld 0x10bd78bb7 dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 179 15 dyld 0x10bd81604 dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 466 16 dyld 0x10bd5ad82 dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 144 17 dyld 0x10bd6165a dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const + 30 18 dyld 0x10bd6e76e dyld4::APIs::runAllInitializersForMain() + 38 19 dyld 0x10bd4c38d dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*) + 3443 20 dyld 0x10bd4b4e4 start + 388 Can someone at Apple with the right domain expertise confirm that this is the type of crash you would see because the framework was built assuming it would run on macOS 14.6 and later, and when facing an older environment (e.g. ObjC runtime) it lacks extra code that would ensure backward compatibility with the earlier ObjC runtime found on macOS 12.x?
2
0
292
Jun ’25
Couldn't able to hear audio via speaker on ios real device
This is my native module code implementation I'm getting base64 encoded string from server and passing this to my native module of pcm player to play audio App.tsx PcmPlayer.writeChunk(e.data); PcmPlayer.swift import AVFoundation @objc(PcmPlayer) class PcmPlayer: RCTEventEmitter { private var engine: AVAudioEngine? private var playerNode: AVAudioPlayerNode? private var format: AVAudioFormat? private var bufferQueue = [Data]() private var isPlaying = false private var hasEnded = false private var scheduledBufferCount = 0 private let minBufferBytes = 50000 private let pcmQueue = DispatchQueue(label: "pcm.queue") override init() { super.init() } override func supportedEvents() -> [String]! { return ["onStatus", "onMessage"] } @objc(initPlayer:channels:bitsPerSample:) func initPlayer(_ sampleRate: NSNumber, channels: NSNumber, bitsPerSample: NSNumber) { pcmQueue.async { self.stopInternal() let session = AVAudioSession.sharedInstance() do { try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true, options: .notifyOthersOnDeactivation) try session.setMode(.default) print("🔈 Audio session active. Output route:", session.currentRoute.outputs) } catch { print("❌ Audio session setup failed:", error) return } self.engine = AVAudioEngine() self.playerNode = AVAudioPlayerNode() guard let engine = self.engine, let playerNode = self.playerNode else { print("❌ Engine or playerNode is nil") return } engine.attach(playerNode) self.format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: sampleRate.doubleValue, channels: AVAudioChannelCount(channels.uintValue), interleaved: false) guard let format = self.format else { print("❌ Failed to create AVAudioFormat") return } engine.connect(playerNode, to: engine.mainMixerNode, format: format) do { try engine.start() playerNode.play() engine.mainMixerNode.outputVolume = 1.0 print("✅ AVAudioEngine started with format:", format) } catch { print("❌ Engine start failed:", error) } self.hasEnded = false } } @objc(writeChunk:) func writeChunk(_ base64Pcm: String) { pcmQueue.async { guard base64Pcm.count >= 10 else { print("⚠️ Skipping short base64 string") return } var padded = base64Pcm let mod4 = base64Pcm.count % 4 if mod4 > 0 { padded += String(repeating: "=", count: 4 - mod4) } guard let data = Data(base64Encoded: padded, options: .ignoreUnknownCharacters) else { print("❌ Failed to decode base64") return } self.bufferQueue.append(data) print("📥 Received PCM chunk (\(data.count) bytes)") print("📥 writeChunk called. isPlaying=\(self.isPlaying), bufferQueue.count=\(self.bufferQueue.count)") if !self.isPlaying { self.isPlaying = true self.waitForBufferAndStartPlayback() } else if self.scheduledBufferCount == 0 { self.isPlaying = true self.waitForBufferAndStartPlayback() } } } private func waitForBufferAndStartPlayback() { DispatchQueue.global().async { while self.queueSize() < self.minBufferBytes && !self.hasEnded { Thread.sleep(forTimeInterval: 0.01) } self.writeLoop() } } private func writeLoop() { DispatchQueue.global().async { writeLoop: while self.isPlaying { if self.bufferQueue.isEmpty { for _ in 0..<100 { Thread.sleep(forTimeInterval: 0.01) if !self.bufferQueue.isEmpty { break } } if self.bufferQueue.isEmpty { print("🔇 No more data to play after waiting") self.isPlaying = false break writeLoop } } var data: Data? self.pcmQueue.sync { if !self.bufferQueue.isEmpty { data = self.bufferQueue.removeFirst() } } guard let chunk = data else { print("⚠️ No data to process") continue } if let buffer = self.pcmBufferFromData(chunk) { self.scheduledBufferCount += 1 self.playerNode?.scheduleBuffer(buffer, completionHandler: { self.pcmQueue.async { self.scheduledBufferCount -= 1 if self.bufferQueue.isEmpty && self.scheduledBufferCount == 0 { print("ℹ️ Playback idle - waiting for more data") self.isPlaying = false } } }) } } } } private func pcmBufferFromData(_ data: Data) -> AVAudioPCMBuffer? { guard let format = self.format else { return nil } let frameCount = UInt32(data.count / 2) guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount) else { print("❌ Failed to create AVAudioPCMBuffer") return nil } buffer.frameLength = frameCount guard let floatChannelData = buffer.floatChannelData?[0] else { print("❌ floatChannelData is nil") return nil } data.withUnsafeBytes { (rawBuffer: UnsafeRawBufferPointer) in let int16Buffer = rawBuffer.bindMemory(to: Int16.self) let count = min(int16Buffer.count, Int(frameCount)) for i in 0..<count { floatChannelData[i] = Float32(int16Buffer[i]) / Float32(Int16.max) } } return buffer } @objc(stopPlayer) func stopPlayer() { pcmQueue.async { self.stopInternal() } } private func stopInternal() { print("🛑 stopInternal called") self.playerNode?.stop() self.engine?.stop() self.engine?.reset() self.playerNode = nil self.engine = nil self.format = nil self.bufferQueue.removeAll() self.isPlaying = false self.hasEnded = true self.scheduledBufferCount = 0 } @objc(canWrite:rejecter:) func canWrite(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { resolve(self.bufferQueue.count < 20) } } @objc(flushPlayer:rejecter:) func flushPlayer(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { self.bufferQueue.removeAll() resolve(nil) } } @objc static override func requiresMainQueueSetup() -> Bool { return false } private func queueSize() -> Int { return pcmQueue.sync { return self.bufferQueue.reduce(0) { $0 + $1.count } } } } I couldn't able to hear any audio via my real iOS device also it is working fine on emulator.
0
0
239
Jul ’25
AirPlay v1 is broken in iOS 18.4?
After upgrading to iOS 18.4, I'm no longer able to establish an AirPlay v1 connection to an audio system. The symptom is that the AirPlay route picker just spins when trying to connect to an audio system. It eventually gives up. I tested this on an iPhone 14, connecting to a HomePod, AirPort express, AppleTV and a Wiim Pro. If I try connecting with AirPlay v2, ex: using Apple Music, the connection succeeds and audio can be played. I'm the developer of an app that plays audio over AirPlay while also recording. My app has to use AirPlay v1 because AvAudioSession doesn't allow the policy .longFormAudio when the category is .playAndRecord. This issue is a real pain as it means my app is suddenly broken for many thousands of users. Is anyone else seeing this issue? Any suggestions for a workaround?
2
3
657
Jun ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
0
0
586
Jul ’25
Apple Music API High Error Rate
I am getting high error rates from the Apple Music API. This has been happening for months now, and it is quite frustrating. It is a mix of 404, 504, and random 500 errors. I hit these endpoints all of the time, so it is not like I am hitting a resource that doesn't exist. Why is this happening? Is this a known issue that is getting worked on?
0
0
141
Jun ’25
In Speech framework is SFTranscriptionSegment timing supposed to be off and speechRecognitionMetadata nil until isFinal?
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files. Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing. I've stripped my class down to the following: private var isAuthed = false // I call this in a .task {} in my SwiftUI View public func requestSpeechRecognizerPermission() { SFSpeechRecognizer.requestAuthorization { authStatus in Task { self.isAuthed = authStatus == .authorized } } } public func transcribe(from url: URL) { guard isAuthed else { return } let locale = Locale(identifier: "en-US") let recognizer = SFSpeechRecognizer(locale: locale) let recognitionRequest = SFSpeechURLRecognitionRequest(url: url) // the behaviour occurs whether I set this to true or not, I recently set // it to true to see if it made a difference recognizer?.supportsOnDeviceRecognition = true recognitionRequest.shouldReportPartialResults = true recognitionRequest.addsPunctuation = true recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in guard result != nil else { return } if result!.isFinal { //speechRecognitionMetadata is not nil } else { //speechRecognitionMetadata is nil } } } } Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true. The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values. The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
1
0
193
Jul ’25
MPNowPlayingInfoCenter nowPlayingInfo throttled
Hello, I have been running into issues with setting nowPlayingInfo information, specifically updating information for CarPlay and the CPNowPlayingTemplate. When I start playback for an item, I see lock screen information update as expected, along with the CarPlay now playing information. However, the playing items are books with collections of tracks. When I select a new track(chapter) within the book, I set the MPMediaItemPropertyTitle to the new chapter name. This change is reflected correctly on the lock screen, but almost never appears correctly on the CarPlay CPNowPlayingTemplate. The previous chapter title remains set and never updates. I see "Application exceeded audio metadata throttle limit." in the debug console fairly frequently. From that a I figured that I need to minimize updates to the nowPlayingInfo dictionary. What I did: I store the metadata dictionary in a local dictionary and only set values in the main nowPlayingInfo dictionary when they are different from the current value. I kick off the nowPlayingInfo update via a task that initially sleeps for around 2 seconds (not a final value, just for my current testing). If a previous Task is active, it gets cancelled, so that only one update can happen within that time window. Neither of these things have been sufficient. I can switch between different titles entirely and the information updates (including cover art). But when I switch chapters within a title, the MPMediaItemPropertyTitle continues to get dropped. I know the value is getting set, because it updates on the lock screen correctly. In total, I have 12 keys I update for info, though with the above changes, usually 2-4 of them actually get updated with high frequency. I am running out of ideas to satisfy the throttling thresholds to accurately display metadata. I could use some advice. Thanks.
4
1
255
May ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }
4
0
532
Aug ’25
When saving a 48MP image, CGImageDestinationFinalize consumes a large amount of memory. Are there any optimization methods available?
it will use about 300MB memory, it cause a memory peak
Replies
1
Boosts
0
Views
518
Activity
Jul ’25
SpeechTranscriber not providing audioTimeRange for most results
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes. Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
Replies
3
Boosts
0
Views
230
Activity
Aug ’25
Can I Fade Out Track Volume Before End Using ApplicationMusicPlayer?
I’m building a music app using Apple Music streaming via ApplicationMusicPlayer. My goal is to decrease the volume of the current song during the last 10 seconds, and when the next track begins, restore the volume to its normal level. I know that ApplicationMusicPlayer doesn’t expose a volume API, and I want to avoid triggering the system volume HUD. ✅ Using Apple Music streaming (not local files) ❓ Is it possible to implement per-track fade-out/fade-in logic with ApplicationMusicPlayer? Appreciate any clarification or official guidance!
Replies
0
Boosts
0
Views
129
Activity
Jun ’25
HDR video & screen brightness
When I play an HDR video in the iPhone Photos app, I can see the HDR effect obviously. But if this HDR video is played continuously for more than 30-40 minutes, the HDR effect will disappear and the brightness will be compressed to the SDR range. This issue will appear on any iPhone. Depending on the phone, it may be 20-30 minutes, or 30-40 minutes, or even a few minutes, such as iPhone 12 mini. Similarly, if I use AVPlayer to play and preview an HDR video, if it plays more than 30-40 minutes, the HDR effect will disappear and the screen brightness will dim. Also the currentEDRHeadroom will gradually decrease to 1 Note, test it with an HDR video longer than 1 hour, and if the video is short, please loop it. My question is how to avoid losing the HDR effect after 30-40 minutes when I use CAMetalLayer to render any HDR video.
Replies
1
Boosts
0
Views
227
Activity
Jul ’25
AVAssetResourceLoaderDelegate for radio stream
Hi everyone, I’m trying to use AVAssetResourceLoaderDelegate to handle a live radio stream (e.g. Icecast/HTTP stream). My goal is to have access to the last 30 seconds of audio data during playback, so I can analyze it for specific audio patterns in near-real-time. I’ve implemented a custom resource loader that works fine for podcasts and static files, where the file size and content length are known. However, for infinite live streams, my current implementation stops receiving new loading requests after the first one is served. As a result, the playback either stalls or fails to continue. Has anyone successfully used AVAssetResourceLoaderDelegate with a continuous radio stream? Or maybe you can suggest betterapproach for buffering and analyzing live audio? Any tips, examples, or advice would be appreciated. Thanks!
Replies
0
Boosts
0
Views
181
Activity
Jun ’25
Is there a way to get lossless music playback on macOS?
I noticed that while playing back the same tracks via MusicKit on different OSes I get different results regarding the audio files being streamed. Playing back a lossless file with 24Bit 48kHz and watching the Console for RemotePlayerService I get: on iPadOS: Lossless; groupID: audio-alac-stereo-48000-24; bitDepth: 24-bit; sampleRate: 48khz; codec: alac; channels: 2; layout: Stereo; on macOS: Creating AudioQueue with format:'paac', framesPerPacket:1024, sampleRate:44100 While the iPad looks perfect, the Mac does not. Is there a way to fix this issue on macOS. BTW: I switched the Audio-Midi Settings before, after and while the macOS App was lunched. I also switched to different output devices. I wasn't able to change the bad audio-output on the mac. I tested this under Sequoia 15.5 and Tahoe beta 1, Xcode 16.4 and 26 beta 1. The AudioVariants of the Album/Tracks are .dolbyAtmos, .lossless, .lossyStereo Apple Music displays Lossless 24 Bit/48 kHz ALAC when clicking on the playercontroll icon on macOS I hope there are only some missing or misconfigured properties to get macOS up to par. Thanks :-)
Replies
0
Boosts
1
Views
189
Activity
Jun ’25
Execution breakpoint when trying to play a music library file with AVAudioEngine
Hi all, I'm working on an audio visualizer app that plays files from the user's music library utilizing MediaPlayer and AVAudioEngine. I'm working on getting the music library functionality working before the visualizer aspect. After setting up the engine for file playback, my app inexplicably crashes with an EXC_BREAKPOINT with code = 1. Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements. I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that. The device that I'm testing this on is running iOS 26 beta 3, although my app is designed for iOS 18 and above. After commenting out code, it seems that the app crashes at the scheduleFile call in the play function, but I'm not fully sure. Here is the setupAudioEngine function: private func setupAudioEngine() { do { try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default) try AVAudioSession.sharedInstance().setActive(true) } catch { print("Audio session error: \(error)") } engine.attach(playerNode) engine.attach(analyzer) engine.connect(playerNode, to: analyzer, format: nil) engine.connect(analyzer, to: engine.mainMixerNode, format: nil) analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in self?.processAudioBuffer(buffer) } } Here is the play function: func play(_ mediaItem: MPMediaItem) { guard let assetURL = mediaItem.assetURL else { print("No asset URL for media item") return } stop() do { audioFile = try AVAudioFile(forReading: assetURL) guard let audioFile else { print("Failed to create audio file") return } duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate if !engine.isRunning { try engine.start() } playerNode.scheduleFile(audioFile, at: nil) playerNode.play() DispatchQueue.main.async { [weak self] in self?.isPlaying = true self?.startDisplayLink() } } catch { print("Error playing audio: \(error)") DispatchQueue.main.async { [weak self] in self?.isPlaying = false self?.stopDisplayLink() } } } Here is a link to my test project if you want to try it out for yourself: https://github.com/aabagdi/VisualMan-example Thanks!
Replies
8
Boosts
0
Views
711
Activity
Jul ’25
FxPlug SDK 4.3.2 causes dyld errors when loaded on versions of macOS prior to 14.6
FxPlug is one of Apple’s official SDKs, recently updated to version 4.3.2. In theory the SDK should guarantee third-parties can build plug-ins that are backward compatible with older versions of Final Cut Pro, Motion and Compressor. FxPlug SDK includes two frameworks that third-party developers like me end up bundling inside our third-party plugins: FxPlug.framework and PlugInManager.framework. Behind the scenes, the SDK relies on PlugInKit, but the FxPlug.framework provides abstractions so that third-parties don't have to handle the intricacies of XPC directly. The most recent version of FxPlug.framework included with the SDK was possibly built with an error: the Info.plist shows a LSMinimumSystemVersion entry of 14.6, suggesting the binary may have been compiled and linked with MACOSX_DEPLOYMENT_TARGET set to 14.6 by accident. The problem: when older versions of Final Cut Pro or Motion load a third-party plugin (itself built with the appropriate deployment target, macOS 11 or 12, for example) on pre-macOS 14.6, the dynamic linker immediately loads Apple’s own FxPlug.framework, but this causes the process to crash immediately: Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 libobjc.A.dylib 0x7ff81e065955 map_images_nolock + 5399 1 libobjc.A.dylib 0x7ff81e0643d6 map_images + 67 2 dyld 0x10bd551fb invocation function for block in dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*)) + 275 3 dyld 0x10bd506c9 dyld4::RuntimeState::withLoadersReadLock(void () block_pointer) + 41 4 dyld 0x10bd550e2 dyld4::RuntimeState::setObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*)) + 82 5 dyld 0x10bd68d45 dyld4::APIs::_dyld_objc_notify_register(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*)) + 79 6 libobjc.A.dylib 0x7ff81e064244 _objc_init + 1279 7 libdispatch.dylib 0x7ff81e01d993 _os_object_init + 13 8 libdispatch.dylib 0x7ff81e02b1b8 libdispatch_init + 311 9 libSystem.B.dylib 0x7ff828fd585f libSystem_initializer + 238 10 dyld 0x10bd5ae4f invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 182 11 dyld 0x10bd81aad invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 242 12 dyld 0x10bd78e26 invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 557 13 dyld 0x10bd47db3 dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 129 14 dyld 0x10bd78bb7 dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 179 15 dyld 0x10bd81604 dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 466 16 dyld 0x10bd5ad82 dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 144 17 dyld 0x10bd6165a dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const + 30 18 dyld 0x10bd6e76e dyld4::APIs::runAllInitializersForMain() + 38 19 dyld 0x10bd4c38d dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*) + 3443 20 dyld 0x10bd4b4e4 start + 388 Can someone at Apple with the right domain expertise confirm that this is the type of crash you would see because the framework was built assuming it would run on macOS 14.6 and later, and when facing an older environment (e.g. ObjC runtime) it lacks extra code that would ensure backward compatibility with the earlier ObjC runtime found on macOS 12.x?
Replies
2
Boosts
0
Views
292
Activity
Jun ’25
How to toggle usb device
Can i use iokit usb lib to disable build-in camera?
Replies
4
Boosts
0
Views
292
Activity
Jun ’25
Regarding the issue of obtaining input channels for aggregated devices
I found that the aggregated device correctly obtains input channels in the standard microphone mode. However, in voice isolation mode, it only retrieves channels from the first sub-device in the aggregated device's list. If I want to properly obtain channel information in voice isolation mode, how should I do it?
Replies
0
Boosts
0
Views
525
Activity
Jun ’25
Couldn't able to hear audio via speaker on ios real device
This is my native module code implementation I'm getting base64 encoded string from server and passing this to my native module of pcm player to play audio App.tsx PcmPlayer.writeChunk(e.data); PcmPlayer.swift import AVFoundation @objc(PcmPlayer) class PcmPlayer: RCTEventEmitter { private var engine: AVAudioEngine? private var playerNode: AVAudioPlayerNode? private var format: AVAudioFormat? private var bufferQueue = [Data]() private var isPlaying = false private var hasEnded = false private var scheduledBufferCount = 0 private let minBufferBytes = 50000 private let pcmQueue = DispatchQueue(label: "pcm.queue") override init() { super.init() } override func supportedEvents() -> [String]! { return ["onStatus", "onMessage"] } @objc(initPlayer:channels:bitsPerSample:) func initPlayer(_ sampleRate: NSNumber, channels: NSNumber, bitsPerSample: NSNumber) { pcmQueue.async { self.stopInternal() let session = AVAudioSession.sharedInstance() do { try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true, options: .notifyOthersOnDeactivation) try session.setMode(.default) print("🔈 Audio session active. Output route:", session.currentRoute.outputs) } catch { print("❌ Audio session setup failed:", error) return } self.engine = AVAudioEngine() self.playerNode = AVAudioPlayerNode() guard let engine = self.engine, let playerNode = self.playerNode else { print("❌ Engine or playerNode is nil") return } engine.attach(playerNode) self.format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: sampleRate.doubleValue, channels: AVAudioChannelCount(channels.uintValue), interleaved: false) guard let format = self.format else { print("❌ Failed to create AVAudioFormat") return } engine.connect(playerNode, to: engine.mainMixerNode, format: format) do { try engine.start() playerNode.play() engine.mainMixerNode.outputVolume = 1.0 print("✅ AVAudioEngine started with format:", format) } catch { print("❌ Engine start failed:", error) } self.hasEnded = false } } @objc(writeChunk:) func writeChunk(_ base64Pcm: String) { pcmQueue.async { guard base64Pcm.count >= 10 else { print("⚠️ Skipping short base64 string") return } var padded = base64Pcm let mod4 = base64Pcm.count % 4 if mod4 > 0 { padded += String(repeating: "=", count: 4 - mod4) } guard let data = Data(base64Encoded: padded, options: .ignoreUnknownCharacters) else { print("❌ Failed to decode base64") return } self.bufferQueue.append(data) print("📥 Received PCM chunk (\(data.count) bytes)") print("📥 writeChunk called. isPlaying=\(self.isPlaying), bufferQueue.count=\(self.bufferQueue.count)") if !self.isPlaying { self.isPlaying = true self.waitForBufferAndStartPlayback() } else if self.scheduledBufferCount == 0 { self.isPlaying = true self.waitForBufferAndStartPlayback() } } } private func waitForBufferAndStartPlayback() { DispatchQueue.global().async { while self.queueSize() < self.minBufferBytes && !self.hasEnded { Thread.sleep(forTimeInterval: 0.01) } self.writeLoop() } } private func writeLoop() { DispatchQueue.global().async { writeLoop: while self.isPlaying { if self.bufferQueue.isEmpty { for _ in 0..<100 { Thread.sleep(forTimeInterval: 0.01) if !self.bufferQueue.isEmpty { break } } if self.bufferQueue.isEmpty { print("🔇 No more data to play after waiting") self.isPlaying = false break writeLoop } } var data: Data? self.pcmQueue.sync { if !self.bufferQueue.isEmpty { data = self.bufferQueue.removeFirst() } } guard let chunk = data else { print("⚠️ No data to process") continue } if let buffer = self.pcmBufferFromData(chunk) { self.scheduledBufferCount += 1 self.playerNode?.scheduleBuffer(buffer, completionHandler: { self.pcmQueue.async { self.scheduledBufferCount -= 1 if self.bufferQueue.isEmpty && self.scheduledBufferCount == 0 { print("ℹ️ Playback idle - waiting for more data") self.isPlaying = false } } }) } } } } private func pcmBufferFromData(_ data: Data) -> AVAudioPCMBuffer? { guard let format = self.format else { return nil } let frameCount = UInt32(data.count / 2) guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount) else { print("❌ Failed to create AVAudioPCMBuffer") return nil } buffer.frameLength = frameCount guard let floatChannelData = buffer.floatChannelData?[0] else { print("❌ floatChannelData is nil") return nil } data.withUnsafeBytes { (rawBuffer: UnsafeRawBufferPointer) in let int16Buffer = rawBuffer.bindMemory(to: Int16.self) let count = min(int16Buffer.count, Int(frameCount)) for i in 0..<count { floatChannelData[i] = Float32(int16Buffer[i]) / Float32(Int16.max) } } return buffer } @objc(stopPlayer) func stopPlayer() { pcmQueue.async { self.stopInternal() } } private func stopInternal() { print("🛑 stopInternal called") self.playerNode?.stop() self.engine?.stop() self.engine?.reset() self.playerNode = nil self.engine = nil self.format = nil self.bufferQueue.removeAll() self.isPlaying = false self.hasEnded = true self.scheduledBufferCount = 0 } @objc(canWrite:rejecter:) func canWrite(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { resolve(self.bufferQueue.count < 20) } } @objc(flushPlayer:rejecter:) func flushPlayer(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { self.bufferQueue.removeAll() resolve(nil) } } @objc static override func requiresMainQueueSetup() -> Bool { return false } private func queueSize() -> Int { return pcmQueue.sync { return self.bufferQueue.reduce(0) { $0 + $1.count } } } } I couldn't able to hear any audio via my real iOS device also it is working fine on emulator.
Replies
0
Boosts
0
Views
239
Activity
Jul ’25
AVCaptureMetadataOutput .face Not Triggering on iOS 26 – Regression or Intended Change?
Issue: In iOS 26 (tested on Developer Beta), AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when using .face detection. metadataOutput.metadataObjectTypes = [.face]
Replies
1
Boosts
7
Views
389
Activity
Jun ’25
AutoMix Api Available in MusicKit
Is there any way for me to use an AutoMix api in my IOS apps, I would play tracks using the Apple Music api and use AutoMix to attempt to merge tracks. Is this feature/api available to developers.
Replies
0
Boosts
0
Views
150
Activity
Jun ’25
React website working properly in Android but not in Iphone
We have a React website build to scan qr codes. The website is properly working for Android devices but for Iphone we see a camera glitch causing delay in scan which is unexpected. Website URL : https://react-qr-code-scanner-app.vercel.app/
Replies
0
Boosts
0
Views
58
Activity
Jul ’25
AirPlay v1 is broken in iOS 18.4?
After upgrading to iOS 18.4, I'm no longer able to establish an AirPlay v1 connection to an audio system. The symptom is that the AirPlay route picker just spins when trying to connect to an audio system. It eventually gives up. I tested this on an iPhone 14, connecting to a HomePod, AirPort express, AppleTV and a Wiim Pro. If I try connecting with AirPlay v2, ex: using Apple Music, the connection succeeds and audio can be played. I'm the developer of an app that plays audio over AirPlay while also recording. My app has to use AirPlay v1 because AvAudioSession doesn't allow the policy .longFormAudio when the category is .playAndRecord. This issue is a real pain as it means my app is suddenly broken for many thousands of users. Is anyone else seeing this issue? Any suggestions for a workaround?
Replies
2
Boosts
3
Views
657
Activity
Jun ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
Replies
0
Boosts
0
Views
586
Activity
Jul ’25
Apple Music API High Error Rate
I am getting high error rates from the Apple Music API. This has been happening for months now, and it is quite frustrating. It is a mix of 404, 504, and random 500 errors. I hit these endpoints all of the time, so it is not like I am hitting a resource that doesn't exist. Why is this happening? Is this a known issue that is getting worked on?
Replies
0
Boosts
0
Views
141
Activity
Jun ’25
In Speech framework is SFTranscriptionSegment timing supposed to be off and speechRecognitionMetadata nil until isFinal?
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files. Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing. I've stripped my class down to the following: private var isAuthed = false // I call this in a .task {} in my SwiftUI View public func requestSpeechRecognizerPermission() { SFSpeechRecognizer.requestAuthorization { authStatus in Task { self.isAuthed = authStatus == .authorized } } } public func transcribe(from url: URL) { guard isAuthed else { return } let locale = Locale(identifier: "en-US") let recognizer = SFSpeechRecognizer(locale: locale) let recognitionRequest = SFSpeechURLRecognitionRequest(url: url) // the behaviour occurs whether I set this to true or not, I recently set // it to true to see if it made a difference recognizer?.supportsOnDeviceRecognition = true recognitionRequest.shouldReportPartialResults = true recognitionRequest.addsPunctuation = true recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in guard result != nil else { return } if result!.isFinal { //speechRecognitionMetadata is not nil } else { //speechRecognitionMetadata is nil } } } } Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true. The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values. The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
Replies
1
Boosts
0
Views
193
Activity
Jul ’25
MPNowPlayingInfoCenter nowPlayingInfo throttled
Hello, I have been running into issues with setting nowPlayingInfo information, specifically updating information for CarPlay and the CPNowPlayingTemplate. When I start playback for an item, I see lock screen information update as expected, along with the CarPlay now playing information. However, the playing items are books with collections of tracks. When I select a new track(chapter) within the book, I set the MPMediaItemPropertyTitle to the new chapter name. This change is reflected correctly on the lock screen, but almost never appears correctly on the CarPlay CPNowPlayingTemplate. The previous chapter title remains set and never updates. I see "Application exceeded audio metadata throttle limit." in the debug console fairly frequently. From that a I figured that I need to minimize updates to the nowPlayingInfo dictionary. What I did: I store the metadata dictionary in a local dictionary and only set values in the main nowPlayingInfo dictionary when they are different from the current value. I kick off the nowPlayingInfo update via a task that initially sleeps for around 2 seconds (not a final value, just for my current testing). If a previous Task is active, it gets cancelled, so that only one update can happen within that time window. Neither of these things have been sufficient. I can switch between different titles entirely and the information updates (including cover art). But when I switch chapters within a title, the MPMediaItemPropertyTitle continues to get dropped. I know the value is getting set, because it updates on the lock screen correctly. In total, I have 12 keys I update for info, though with the above changes, usually 2-4 of them actually get updated with high frequency. I am running out of ideas to satisfy the throttling thresholds to accurately display metadata. I could use some advice. Thanks.
Replies
4
Boosts
1
Views
255
Activity
May ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }
Replies
4
Boosts
0
Views
532
Activity
Aug ’25