Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Posts under Audio subtopic

Post

Replies

Boosts

Views

Activity

Live Translations on VOIP on iOS26
Hi team, With regards to Call (Live) Translations on VOIP: Is it possible to invoke live translations within the app? (without going into the Call System UI) Is it possible to navigate users from app to Call System UI via an API? (and also invoking the new live translations directly) Will Apple support more languages apart from the current ones? (Currently I see 4 supported languages)
1
0
167
Aug ’25
SpeechTranscriber extremely slow (14+ seconds) despite proper locale allocation and optimization
Using the official SwiftTranscriptionSampleApp from WWDC 2025, speech transcription takes 14+ seconds from audio input to first result, making it unusable for real-time applications. Environment iOS: 26.0 Beta Xcode: Beta 5 Device: iPhone 16 pro Sample App: Official Apple SwiftTranscriptionSampleApp from WWDC 2025 Configuration Tested Locale: en-US (properly allocated with AssetInventory.allocate(locale:)) and es-ES Setup: All optimizations applied (preheating, high priority, model retention) I started testing in my own app to replace SFSpeech API and include speech detection but after long fights with documentation (this part is quite terrible TBH) I tested the example (https://developer.apple.com/documentation/speech/bringing-advanced-speech-to-text-capabilities-to-your-app) and saw same results. I added some logs to check the specific time: 🎙️ [20:30:41.532] ✅ Analyzer started successfully - ready to receive audio! 🎙️ [20:30:41.532] Listening for transcription results... 🎙️ [20:30:56.342] 🚀 FIRST TRANSCRIPTION RESULT after 14.810s: 'Hello' (isFinal: false) Questions Is this expected performance for iOS 26 Beta, because old SFSpeech is far faster? Are there additional optimization steps for SpeechTranscriber? Should we expect significant performance improvements in later betas?
1
0
226
Aug ’25
Audio clipping - macOS Tahoe 26 - Beta 5
I was testing audio playback from YouTube in Safari, and the sound was clipping heavily. At first, I thought it might be due to the poor quality of my small sound system. However, when I took a screenshot and the screenshot sound effect itself produced a loud clipping noise, it became clear that this is not a mechanical problem with my speakers, nor an issue specific to YouTube or Safari. This appears to be a system-wide audio issue in macOS Tahoe 26 - Beta 5.
1
2
313
Aug ’25
SpeechAnalyzer.start(inputSequence:) fails with _GenericObjCError nilError, while the same WAV succeeds with start(inputAudioFile:)
I'm trying to use the new Speech framework for streaming transcription on macOS 26.3, and I can reproduce a failure with SpeechAnalyzer.start(inputSequence:). What is working: SpeechAnalyzer + SpeechTranscriber offline path using start(inputAudioFile:finishAfterFile:) same Spanish WAV file transcribes successfully and returns a coherent final result What is not working: SpeechAnalyzer + SpeechTranscriber stream path using start(inputSequence:) same WAV, replayed as AnalyzerInput(buffer:bufferStartTime:) fails once replay starts with: _GenericObjCError domain=Foundation._GenericObjCError code=0 detail=nilError I also tried: DictationTranscriber instead of SpeechTranscriber no realtime pacing during replay Both still fail in stream mode with the same error. So this does not currently look like a ScreenCaptureKit issue or a Python integration issue. I reduced it to a pure Swift CLI repro. Environment: macOS 26.3 (25D122) Xcode 26.3 Swift 6.2.4 Apple Silicon Mac Has anyone here gotten SpeechAnalyzer.start(inputSequence:) working reliably on macOS 26.x? If so, I'd be interested in any workaround or any detail that differs from the obvious setup: prepareToAnalyze(in:) bestAvailableAudioFormat(...) AnalyzerInput(buffer:bufferStartTime:) replaying a known-good WAV in chunks I already filed Feedback Assistant: FB22149971
1
0
326
3d
Can't set AVAudio sampleRate and installTap needs bufferSize 4800 at minimum
Two issues: No matter what I set in try audioSession.setPreferredSampleRate(x) the sample rate on both iOS and macOS is always 48000 when the output goes through the speaker, and 24000 when my Airpods connect to an iPhone/iPad. Now, I'm checking the current output loudness to animate a 3D character, using mixerNode.installTap(onBus: 0, bufferSize: y, format: nil) { [weak self] buffer, time in Task { @MainActor in // calculate rms and animate character accordingly but any buffer size under 4800 is just ignored and the buffers I get are 4800 sized. This is ok, when the sampleRate is currently 48000, as 10 samples per second lead to decent visual results. But when AirPods connect, the samplerate is 24000, which means only 5 samples per second, so the character animation looks lame. My AVAudioEngine setup is the following: audioEngine.connect(playerNode, to: pitchShiftEffect, format: format) audioEngine.connect(pitchShiftEffect, to: mixerNode, format: format) audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: nil) Now, I'd be fine if the outputNode runs at whatever if it needs, as long as my tap would get at least 10 samples per second. PS: Specifying my favorite format in the let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)! mixerNode.installTap(onBus: 0, bufferSize: y, format: format) doesn't change anything either
1
0
450
Aug ’25
MusicKit: Multichannel Dolby Atmos Limited to Stereo Output - Is This Intended Behavior?
I'm experiencing a significant limitation with MusicKit's Dolby Atmos implementation on macOS and would appreciate clarification on whether this is intended behavior or if there are solutions available. When streaming Dolby Atmos content through MusicKit's ApplicationMusicPlayer, the output is limited to 2-channel stereo, even when: Audio MIDI Setup is configured for 7.1.4 (12-channel) output The same tracks play in full multichannel through the native Apple Music app Dolby Atmos is set to "Automatic" in Apple Music preferences Please let me know if there is anyway to enable this. If not, is this documented anywhere? Thanks!
1
2
302
Aug ’25
sysEx struct in CoreMIDI/MIDIMessages.h
The sysEx struct in the MIDIUniversalMessage struct has a channel member but the System Exclusive (7-Bit) Message doesn't have a channel field. The System Exclusive (7-Bit) Message has a # of bytes field but the sysEx struct doesn't have a nrOfBytes, byteCount or bytesUsed member. It looks like the channel member of the sysEx struct contains the number of used bytes. Is this a mistake in the header or did I misunderstand something?
1
0
587
Dec ’25
Detecting if a phone call is being recorded by another app on iOS
Hello, I’m new here. I'm developing an iOS app and I’d like to know whether it is possible to detect if a phone call is being recorded by another app running in the background. I’ve already reviewed the documentation for CallKit and AVAudioSession, but I couldn’t find anything related. My expectation was that iOS might provide some callback or API to indicate if a call is being recorded (third-party apps), but so far I haven’t found a way. My questions are: Does iOS expose any API to detect if a call is being recorded? If not, is there any indirect, Apple's policy compliant method (e.g., microphone usage events) that can be relied upon? Or is this something that iOS explicitly prevents for privacyreasons? Expecting solutions that align with Apple’s policies and would be accepted under the App Store Review Guidelines. Thanks in advance for any guidance.
1
0
272
Aug ’25
Some questions about musickit
We are developing an apple music app on phone, the developed web works fine on chrome, but when i load it on webivew on my phone, i can't play the first song, We doubt that the drm init, key exchange, session creation was on the music.play() function, while we trigger the play, the drm or session was not ok for play a real song, so it got an error so we may wanna know: what about the realative process of drm, key, session, etc in the play() function? are there some state detect function to show weather the drm is ok?
1
0
160
Mar ’25
SpeechAnalyzer > AnalysisContext lack of documentation
I'm using the new SpeechAnalyzer framework to detect certain commands and want to improve accuracy by giving context. Seems like AnalysisContext is the solution for this, but couldn't find any usage example. So I want to make sure that I'm doing it right or not. let context = AnalysisContext() context.contextualStrings = [ AnalysisContext.ContextualStringsTag("commands"): [ "set speed level", "set jump level", "increase speed", "decrease speed", ... ], AnalysisContext.ContextualStringsTag("vocabulary"): [ "speed", "jump", ... ] ] try await analyzer.setContext(context) With this implementation, it still gives outputs like "Set some speed level", "It's speed level", etc. Also, is it possible to make it expect number after those commands, in order to eliminate results like "set some speed level to" (instead of two).
1
0
496
Jan ’26
MusicKit playbackTime Accuracy
Hello, Has anyone else experienced variations in the accuracy of the playbackTime value? After a few seconds of playback, the reported time adjusts by a fraction of a second, making it difficult to calculate the actual playbackTime of the audio. This can be recreated by playing a song in MusicKit, recording the start time of the audio, playing for at least 10-20 seconds, and then comparing the playbackTime value to one calculated using the start time of the audio. In my experience this jump occurs after about 10 seconds of playback. Any help would be appreciated. Thanks!
1
0
131
May ’25
Not able to write AAC audio with 96 kHz sample rate using AVAudioRecorder or Extended audio file services
Not able to record audio in AAC format with 96 kHz sample rate using AVAudioRecorder or Extended Audio File services with 96 kHz input audio from input device. The audio recording settings used are let settings: [String: Any] = [ AVFormatIDKey: Int(kAudioFormatMPEG4AAC), AVSampleRateKey: sampleRate AVNumberOfChannelsKey: 1 AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue ] When tried using AVAudioEngine using AVAudioFile, AVAudioFile(forWriting: fileURL, // file extension .m4a settings: fileSettings, commonFormat: AVAudioCommonFormat.pcmFormatFloat32, interleaved: interleaved) else { return } got error CodecConverterFactory.cpp:977 unable to select compatible encoder sample rate AudioConverter.cpp:1017 Failed to create a new in process converter -> from 1 ch, 96000 Hz, Float32 to 1 ch, 96000 Hz, aac (0x00000000) 0 bits/channel, 0 bytes/packet, 0 frames/packet, 0 bytes/frame, with status 1718449215
1
0
567
Nov ’25
Audio driver based on AudioDriverKit sometimes hangs after sleep
Dear Sirs, I’ve written a virtual audio driver based on AudioDriverKit and running as dext in my MacOS app. Sometimes when waking up from a sleep state the recording side of my driver extension seems to hang and I don’t see any calls to my io_operation callback. Then the recording app like a DAW seems to hang when trying to start a recording. This doesn’t happen after short sleep states or after a complete new start of my MacBook. I already opened a case in Feedback-Assistant on 5th of May (FB17503622) which also includes a sysdiagnose and a ktrace but I didn't get any feedback so far. Meanwhile some of our customers are getting angry and I'd like to know if there's anything I could do to fix this problem on my side. We’re not sure whether this worked in previous MacOS versions, we think we didn’t observe this before 15.3.1 but at least since 15.3.1. we’ve seen this problem. Best regards, Johannes
1
0
196
Aug ’25
SpeechTranscriber supported Devices
I have the new iOS 26 SpeechTranscriber working in my application. The issue I am facing is how to determine if the device I am running on supports SpeechTranscriber. I was able to create code that tests if the device supports transcription but it takes a bit of time to run and thus the results are not available when the app launches. What I am looking for is a list of what iOS 26 devices it doesn't run on. I think its safe to assume any new devices will support it so if we can just have a list of what devices that can run iOS 26 and not able to do transcription it would be much faster for the app. I have determined it doesn't work on a SE 2nd Gen, it works on iPhone 12, SE 3rd Gen, iPhone 14 Pro, 15 Pro. As the SpeechTranscriber doesn't work in the simulator I can't determine that way. I have checked the docs and it doesn't list the devices it doesn't work on.
1
0
532
Nov ’25
Mic audio before and after a call is answered
I have an app that records a health provider’s conversation with a patient. I am using Audio Queue Services for this. If a phone call comes in while recording, the doctor wants to be able to ignore the call and continue the conversation without touching the phone. If the doctor answers the call, that’s fine – I will stop the recording. I can detect when the call comes in and ends using CXCallObserver and AVAudioSession.interruptionNotification. Unfortunately, when a call comes in and before it is answered or dismissed, the audio is suppressed. After the call is dismissed, the audio continues to be suppressed. How can I continue to get audio from the mic as long as the user does not answer the phone call?
1
0
74
May ’25
Unexpected AVAudioSession behavior after iOS 18.5 causing audio loss in VoIP calls
After updating to iOS 18.5, we’ve observed that outgoing audio from our app intermittently stops being transmitted during VoIP calls using AVAudioSession configured with .playAndRecord and .voiceChat. The session is set active without errors, and interruptions are handled correctly, yet audio capture suddenly ceases mid-call. This was not observed in earlier iOS versions (≤ 18.4). We’d like to confirm if there have been any recent changes in AVAudioSession, CallKit, or related media handling that could affect audio input behavior during long-running calls. func configureForVoIPCall() throws { try setCategory( .playAndRecord, mode: .voiceChat, options: [.allowBluetooth, .allowBluetoothA2DP, .defaultToSpeaker]) try setActive(true) }
1
0
284
Aug ’25
Start and stop recording Voice Memos with Siri
using iOS 26.2; Airpods 4 Long press stem to launch Siri Speak "Record Voice Memo" -> Recording starts Recording in progress... Long press stem to launch Siri -> Nothing happens. To stop recording need use phone. is this intended behaviour? i would like to be able to stop recording with Siri I am able to launch Siri from phone while recording, but point is to keep phone in pocket and start/stop recordings only via Airpods.
1
0
184
Dec ’25
In Speech framework is SFTranscriptionSegment timing supposed to be off and speechRecognitionMetadata nil until isFinal?
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files. Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing. I've stripped my class down to the following: private var isAuthed = false // I call this in a .task {} in my SwiftUI View public func requestSpeechRecognizerPermission() { SFSpeechRecognizer.requestAuthorization { authStatus in Task { self.isAuthed = authStatus == .authorized } } } public func transcribe(from url: URL) { guard isAuthed else { return } let locale = Locale(identifier: "en-US") let recognizer = SFSpeechRecognizer(locale: locale) let recognitionRequest = SFSpeechURLRecognitionRequest(url: url) // the behaviour occurs whether I set this to true or not, I recently set // it to true to see if it made a difference recognizer?.supportsOnDeviceRecognition = true recognitionRequest.shouldReportPartialResults = true recognitionRequest.addsPunctuation = true recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in guard result != nil else { return } if result!.isFinal { //speechRecognitionMetadata is not nil } else { //speechRecognitionMetadata is nil } } } } Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true. The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values. The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
1
0
169
Jul ’25
Why does AVAudioRecorder show 8 kHz when iPhone hardware is 48 kHz?
Hi everyone, I’m testing audio recording on an iPhone 15 Plus using AVFoundation. Here’s a simplified version of my setup: let settings: [String: Any] = [ AVFormatIDKey: Int(kAudioFormatLinearPCM), AVSampleRateKey: 8000, AVNumberOfChannelsKey: 1, AVLinearPCMBitDepthKey: 16, AVLinearPCMIsFloatKey: false ] audioRecorder = try AVAudioRecorder(url: fileURL, settings: settings) audioRecorder?.record() When I check the recorded file’s sample rate, it logs: Actual sample rate: 8000.0 However, when I inspect the hardware sample rate: try session.setCategory(.playAndRecord, mode: .default) try session.setActive(true) print("Hardware sample rate:", session.sampleRate) I consistently get: `Hardware sample rate: 48000.0 My questions are: Is the iPhone mic actually capturing at 8 kHz, or is it recording at 48 kHz and then downsampling to 8 kHz internally? Is there any way to force the hardware to record natively at 8 kHz? If not, what’s the recommended approach for telephony-quality audio (true 8 kHz) on iOS devices? Thanks in advance for your guidance!
1
0
269
Sep ’25
Live Translations on VOIP on iOS26
Hi team, With regards to Call (Live) Translations on VOIP: Is it possible to invoke live translations within the app? (without going into the Call System UI) Is it possible to navigate users from app to Call System UI via an API? (and also invoking the new live translations directly) Will Apple support more languages apart from the current ones? (Currently I see 4 supported languages)
Replies
1
Boosts
0
Views
167
Activity
Aug ’25
SpeechTranscriber extremely slow (14+ seconds) despite proper locale allocation and optimization
Using the official SwiftTranscriptionSampleApp from WWDC 2025, speech transcription takes 14+ seconds from audio input to first result, making it unusable for real-time applications. Environment iOS: 26.0 Beta Xcode: Beta 5 Device: iPhone 16 pro Sample App: Official Apple SwiftTranscriptionSampleApp from WWDC 2025 Configuration Tested Locale: en-US (properly allocated with AssetInventory.allocate(locale:)) and es-ES Setup: All optimizations applied (preheating, high priority, model retention) I started testing in my own app to replace SFSpeech API and include speech detection but after long fights with documentation (this part is quite terrible TBH) I tested the example (https://developer.apple.com/documentation/speech/bringing-advanced-speech-to-text-capabilities-to-your-app) and saw same results. I added some logs to check the specific time: 🎙️ [20:30:41.532] ✅ Analyzer started successfully - ready to receive audio! 🎙️ [20:30:41.532] Listening for transcription results... 🎙️ [20:30:56.342] 🚀 FIRST TRANSCRIPTION RESULT after 14.810s: 'Hello' (isFinal: false) Questions Is this expected performance for iOS 26 Beta, because old SFSpeech is far faster? Are there additional optimization steps for SpeechTranscriber? Should we expect significant performance improvements in later betas?
Replies
1
Boosts
0
Views
226
Activity
Aug ’25
Audio clipping - macOS Tahoe 26 - Beta 5
I was testing audio playback from YouTube in Safari, and the sound was clipping heavily. At first, I thought it might be due to the poor quality of my small sound system. However, when I took a screenshot and the screenshot sound effect itself produced a loud clipping noise, it became clear that this is not a mechanical problem with my speakers, nor an issue specific to YouTube or Safari. This appears to be a system-wide audio issue in macOS Tahoe 26 - Beta 5.
Replies
1
Boosts
2
Views
313
Activity
Aug ’25
SpeechAnalyzer.start(inputSequence:) fails with _GenericObjCError nilError, while the same WAV succeeds with start(inputAudioFile:)
I'm trying to use the new Speech framework for streaming transcription on macOS 26.3, and I can reproduce a failure with SpeechAnalyzer.start(inputSequence:). What is working: SpeechAnalyzer + SpeechTranscriber offline path using start(inputAudioFile:finishAfterFile:) same Spanish WAV file transcribes successfully and returns a coherent final result What is not working: SpeechAnalyzer + SpeechTranscriber stream path using start(inputSequence:) same WAV, replayed as AnalyzerInput(buffer:bufferStartTime:) fails once replay starts with: _GenericObjCError domain=Foundation._GenericObjCError code=0 detail=nilError I also tried: DictationTranscriber instead of SpeechTranscriber no realtime pacing during replay Both still fail in stream mode with the same error. So this does not currently look like a ScreenCaptureKit issue or a Python integration issue. I reduced it to a pure Swift CLI repro. Environment: macOS 26.3 (25D122) Xcode 26.3 Swift 6.2.4 Apple Silicon Mac Has anyone here gotten SpeechAnalyzer.start(inputSequence:) working reliably on macOS 26.x? If so, I'd be interested in any workaround or any detail that differs from the obvious setup: prepareToAnalyze(in:) bestAvailableAudioFormat(...) AnalyzerInput(buffer:bufferStartTime:) replaying a known-good WAV in chunks I already filed Feedback Assistant: FB22149971
Replies
1
Boosts
0
Views
326
Activity
3d
Can't set AVAudio sampleRate and installTap needs bufferSize 4800 at minimum
Two issues: No matter what I set in try audioSession.setPreferredSampleRate(x) the sample rate on both iOS and macOS is always 48000 when the output goes through the speaker, and 24000 when my Airpods connect to an iPhone/iPad. Now, I'm checking the current output loudness to animate a 3D character, using mixerNode.installTap(onBus: 0, bufferSize: y, format: nil) { [weak self] buffer, time in Task { @MainActor in // calculate rms and animate character accordingly but any buffer size under 4800 is just ignored and the buffers I get are 4800 sized. This is ok, when the sampleRate is currently 48000, as 10 samples per second lead to decent visual results. But when AirPods connect, the samplerate is 24000, which means only 5 samples per second, so the character animation looks lame. My AVAudioEngine setup is the following: audioEngine.connect(playerNode, to: pitchShiftEffect, format: format) audioEngine.connect(pitchShiftEffect, to: mixerNode, format: format) audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: nil) Now, I'd be fine if the outputNode runs at whatever if it needs, as long as my tap would get at least 10 samples per second. PS: Specifying my favorite format in the let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)! mixerNode.installTap(onBus: 0, bufferSize: y, format: format) doesn't change anything either
Replies
1
Boosts
0
Views
450
Activity
Aug ’25
MusicKit: Multichannel Dolby Atmos Limited to Stereo Output - Is This Intended Behavior?
I'm experiencing a significant limitation with MusicKit's Dolby Atmos implementation on macOS and would appreciate clarification on whether this is intended behavior or if there are solutions available. When streaming Dolby Atmos content through MusicKit's ApplicationMusicPlayer, the output is limited to 2-channel stereo, even when: Audio MIDI Setup is configured for 7.1.4 (12-channel) output The same tracks play in full multichannel through the native Apple Music app Dolby Atmos is set to "Automatic" in Apple Music preferences Please let me know if there is anyway to enable this. If not, is this documented anywhere? Thanks!
Replies
1
Boosts
2
Views
302
Activity
Aug ’25
Audio System Trace: Zero Time Stamp
In Instruments, I'm seeing "Zero Time Stamp" events in the "Audio Server" lane. What does that mean?
Replies
1
Boosts
0
Views
166
Activity
2w
sysEx struct in CoreMIDI/MIDIMessages.h
The sysEx struct in the MIDIUniversalMessage struct has a channel member but the System Exclusive (7-Bit) Message doesn't have a channel field. The System Exclusive (7-Bit) Message has a # of bytes field but the sysEx struct doesn't have a nrOfBytes, byteCount or bytesUsed member. It looks like the channel member of the sysEx struct contains the number of used bytes. Is this a mistake in the header or did I misunderstand something?
Replies
1
Boosts
0
Views
587
Activity
Dec ’25
Detecting if a phone call is being recorded by another app on iOS
Hello, I’m new here. I'm developing an iOS app and I’d like to know whether it is possible to detect if a phone call is being recorded by another app running in the background. I’ve already reviewed the documentation for CallKit and AVAudioSession, but I couldn’t find anything related. My expectation was that iOS might provide some callback or API to indicate if a call is being recorded (third-party apps), but so far I haven’t found a way. My questions are: Does iOS expose any API to detect if a call is being recorded? If not, is there any indirect, Apple's policy compliant method (e.g., microphone usage events) that can be relied upon? Or is this something that iOS explicitly prevents for privacyreasons? Expecting solutions that align with Apple’s policies and would be accepted under the App Store Review Guidelines. Thanks in advance for any guidance.
Replies
1
Boosts
0
Views
272
Activity
Aug ’25
Some questions about musickit
We are developing an apple music app on phone, the developed web works fine on chrome, but when i load it on webivew on my phone, i can't play the first song, We doubt that the drm init, key exchange, session creation was on the music.play() function, while we trigger the play, the drm or session was not ok for play a real song, so it got an error so we may wanna know: what about the realative process of drm, key, session, etc in the play() function? are there some state detect function to show weather the drm is ok?
Replies
1
Boosts
0
Views
160
Activity
Mar ’25
SpeechAnalyzer > AnalysisContext lack of documentation
I'm using the new SpeechAnalyzer framework to detect certain commands and want to improve accuracy by giving context. Seems like AnalysisContext is the solution for this, but couldn't find any usage example. So I want to make sure that I'm doing it right or not. let context = AnalysisContext() context.contextualStrings = [ AnalysisContext.ContextualStringsTag("commands"): [ "set speed level", "set jump level", "increase speed", "decrease speed", ... ], AnalysisContext.ContextualStringsTag("vocabulary"): [ "speed", "jump", ... ] ] try await analyzer.setContext(context) With this implementation, it still gives outputs like "Set some speed level", "It's speed level", etc. Also, is it possible to make it expect number after those commands, in order to eliminate results like "set some speed level to" (instead of two).
Replies
1
Boosts
0
Views
496
Activity
Jan ’26
MusicKit playbackTime Accuracy
Hello, Has anyone else experienced variations in the accuracy of the playbackTime value? After a few seconds of playback, the reported time adjusts by a fraction of a second, making it difficult to calculate the actual playbackTime of the audio. This can be recreated by playing a song in MusicKit, recording the start time of the audio, playing for at least 10-20 seconds, and then comparing the playbackTime value to one calculated using the start time of the audio. In my experience this jump occurs after about 10 seconds of playback. Any help would be appreciated. Thanks!
Replies
1
Boosts
0
Views
131
Activity
May ’25
Not able to write AAC audio with 96 kHz sample rate using AVAudioRecorder or Extended audio file services
Not able to record audio in AAC format with 96 kHz sample rate using AVAudioRecorder or Extended Audio File services with 96 kHz input audio from input device. The audio recording settings used are let settings: [String: Any] = [ AVFormatIDKey: Int(kAudioFormatMPEG4AAC), AVSampleRateKey: sampleRate AVNumberOfChannelsKey: 1 AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue ] When tried using AVAudioEngine using AVAudioFile, AVAudioFile(forWriting: fileURL, // file extension .m4a settings: fileSettings, commonFormat: AVAudioCommonFormat.pcmFormatFloat32, interleaved: interleaved) else { return } got error CodecConverterFactory.cpp:977 unable to select compatible encoder sample rate AudioConverter.cpp:1017 Failed to create a new in process converter -> from 1 ch, 96000 Hz, Float32 to 1 ch, 96000 Hz, aac (0x00000000) 0 bits/channel, 0 bytes/packet, 0 frames/packet, 0 bytes/frame, with status 1718449215
Replies
1
Boosts
0
Views
567
Activity
Nov ’25
Audio driver based on AudioDriverKit sometimes hangs after sleep
Dear Sirs, I’ve written a virtual audio driver based on AudioDriverKit and running as dext in my MacOS app. Sometimes when waking up from a sleep state the recording side of my driver extension seems to hang and I don’t see any calls to my io_operation callback. Then the recording app like a DAW seems to hang when trying to start a recording. This doesn’t happen after short sleep states or after a complete new start of my MacBook. I already opened a case in Feedback-Assistant on 5th of May (FB17503622) which also includes a sysdiagnose and a ktrace but I didn't get any feedback so far. Meanwhile some of our customers are getting angry and I'd like to know if there's anything I could do to fix this problem on my side. We’re not sure whether this worked in previous MacOS versions, we think we didn’t observe this before 15.3.1 but at least since 15.3.1. we’ve seen this problem. Best regards, Johannes
Replies
1
Boosts
0
Views
196
Activity
Aug ’25
SpeechTranscriber supported Devices
I have the new iOS 26 SpeechTranscriber working in my application. The issue I am facing is how to determine if the device I am running on supports SpeechTranscriber. I was able to create code that tests if the device supports transcription but it takes a bit of time to run and thus the results are not available when the app launches. What I am looking for is a list of what iOS 26 devices it doesn't run on. I think its safe to assume any new devices will support it so if we can just have a list of what devices that can run iOS 26 and not able to do transcription it would be much faster for the app. I have determined it doesn't work on a SE 2nd Gen, it works on iPhone 12, SE 3rd Gen, iPhone 14 Pro, 15 Pro. As the SpeechTranscriber doesn't work in the simulator I can't determine that way. I have checked the docs and it doesn't list the devices it doesn't work on.
Replies
1
Boosts
0
Views
532
Activity
Nov ’25
Mic audio before and after a call is answered
I have an app that records a health provider’s conversation with a patient. I am using Audio Queue Services for this. If a phone call comes in while recording, the doctor wants to be able to ignore the call and continue the conversation without touching the phone. If the doctor answers the call, that’s fine – I will stop the recording. I can detect when the call comes in and ends using CXCallObserver and AVAudioSession.interruptionNotification. Unfortunately, when a call comes in and before it is answered or dismissed, the audio is suppressed. After the call is dismissed, the audio continues to be suppressed. How can I continue to get audio from the mic as long as the user does not answer the phone call?
Replies
1
Boosts
0
Views
74
Activity
May ’25
Unexpected AVAudioSession behavior after iOS 18.5 causing audio loss in VoIP calls
After updating to iOS 18.5, we’ve observed that outgoing audio from our app intermittently stops being transmitted during VoIP calls using AVAudioSession configured with .playAndRecord and .voiceChat. The session is set active without errors, and interruptions are handled correctly, yet audio capture suddenly ceases mid-call. This was not observed in earlier iOS versions (≤ 18.4). We’d like to confirm if there have been any recent changes in AVAudioSession, CallKit, or related media handling that could affect audio input behavior during long-running calls. func configureForVoIPCall() throws { try setCategory( .playAndRecord, mode: .voiceChat, options: [.allowBluetooth, .allowBluetoothA2DP, .defaultToSpeaker]) try setActive(true) }
Replies
1
Boosts
0
Views
284
Activity
Aug ’25
Start and stop recording Voice Memos with Siri
using iOS 26.2; Airpods 4 Long press stem to launch Siri Speak "Record Voice Memo" -> Recording starts Recording in progress... Long press stem to launch Siri -> Nothing happens. To stop recording need use phone. is this intended behaviour? i would like to be able to stop recording with Siri I am able to launch Siri from phone while recording, but point is to keep phone in pocket and start/stop recordings only via Airpods.
Replies
1
Boosts
0
Views
184
Activity
Dec ’25
In Speech framework is SFTranscriptionSegment timing supposed to be off and speechRecognitionMetadata nil until isFinal?
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files. Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing. I've stripped my class down to the following: private var isAuthed = false // I call this in a .task {} in my SwiftUI View public func requestSpeechRecognizerPermission() { SFSpeechRecognizer.requestAuthorization { authStatus in Task { self.isAuthed = authStatus == .authorized } } } public func transcribe(from url: URL) { guard isAuthed else { return } let locale = Locale(identifier: "en-US") let recognizer = SFSpeechRecognizer(locale: locale) let recognitionRequest = SFSpeechURLRecognitionRequest(url: url) // the behaviour occurs whether I set this to true or not, I recently set // it to true to see if it made a difference recognizer?.supportsOnDeviceRecognition = true recognitionRequest.shouldReportPartialResults = true recognitionRequest.addsPunctuation = true recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in guard result != nil else { return } if result!.isFinal { //speechRecognitionMetadata is not nil } else { //speechRecognitionMetadata is nil } } } } Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true. The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values. The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
Replies
1
Boosts
0
Views
169
Activity
Jul ’25
Why does AVAudioRecorder show 8 kHz when iPhone hardware is 48 kHz?
Hi everyone, I’m testing audio recording on an iPhone 15 Plus using AVFoundation. Here’s a simplified version of my setup: let settings: [String: Any] = [ AVFormatIDKey: Int(kAudioFormatLinearPCM), AVSampleRateKey: 8000, AVNumberOfChannelsKey: 1, AVLinearPCMBitDepthKey: 16, AVLinearPCMIsFloatKey: false ] audioRecorder = try AVAudioRecorder(url: fileURL, settings: settings) audioRecorder?.record() When I check the recorded file’s sample rate, it logs: Actual sample rate: 8000.0 However, when I inspect the hardware sample rate: try session.setCategory(.playAndRecord, mode: .default) try session.setActive(true) print("Hardware sample rate:", session.sampleRate) I consistently get: `Hardware sample rate: 48000.0 My questions are: Is the iPhone mic actually capturing at 8 kHz, or is it recording at 48 kHz and then downsampling to 8 kHz internally? Is there any way to force the hardware to record natively at 8 kHz? If not, what’s the recommended approach for telephony-quality audio (true 8 kHz) on iOS devices? Thanks in advance for your guidance!
Replies
1
Boosts
0
Views
269
Activity
Sep ’25