CNAssetSpatialAudioInfo / Audio Mix rejects ProRes spatial captures (LPCM FOA) — only HEVC (APAC) is eligible. Intended? And how can Audio Mix coexist with ProRes recording?

On iOS/macOS 26, CNAssetSpatialAudioInfo(asset:) and CNAssetSpatialAudioInfo.assetContainsSpatialAudio(asset:) accept a spatial capture only when its spatial track is APAC-encoded, which is what AVCaptureMovieFileOutput produces when the video codec is HEVC. An otherwise-identical ProRes capture, whose spatial track is LPCM (4-channel First-Order Ambisonics, kAudioChannelLayoutTag_HOA_ACN_SN3D | 4), is rejected with CNCinematicErrorDomain code 3 (CNCinematicErrorCodeIncomplete): "no eligible audio tracks in asset".

This reproduces with Apple's own SpatialAudioCLI sample run on Apple's own stock iPhone captures, so it appears to be a property of the format/API rather than my code.
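For reference, a minimal sketch of the check that fails. It assumes the two entry points named above are exposed to Swift as an async class method and an async throwing initializer; describe(_:) and checkAudioMixEligibility(of:) are hypothetical helper names, not part of any framework.

```swift
import AVFoundation
import Cinematic
import Foundation

/// Hypothetical helper: formats an error as "domain code N: message".
func describe(_ error: Error) -> String {
    let nsError = error as NSError
    return "\(nsError.domain) code \(nsError.code): \(nsError.localizedDescription)"
}

/// Hypothetical helper: runs both eligibility entry points against one movie file.
/// The availability annotation is an assumption based on the OS versions in this thread.
@available(iOS 26.0, macOS 26.0, *)
func checkAudioMixEligibility(of url: URL) async {
    let asset = AVURLAsset(url: url)

    // Boolean pre-check (assumed async class method).
    let eligible = await CNAssetSpatialAudioInfo.assetContainsSpatialAudio(asset: asset)
    print("\(url.lastPathComponent): assetContainsSpatialAudio = \(eligible)")

    // The initializer reports the reason for rejection as a thrown error.
    do {
        _ = try await CNAssetSpatialAudioInfo(asset: asset)
        print("CNAssetSpatialAudioInfo created")
    } catch {
        print("rejected: \(describe(error))")
    }
}
```

Running this on an HEVC capture and on a ProRes capture of the same scene is enough to show the difference described above.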

I'd like to confirm whether this is intended, and find a supported way to obtain Audio-Mix-eligible spatial audio while still recording ProRes video (we use Apple Log/ProRes for color grading). In particular, is a manually wired AVAssetWriter pipeline the only way to combine spatial audio with ProRes video?


Eligibility appears to require an APAC-encoded, exactly-4-channel FOA track. Because AVCaptureMovieFileOutput only writes APAC audio for HEVC (ProRes forces LPCM), ProRes spatial captures are never eligible — including Apple's own ProRes stock captures, which SpatialAudioCLI also rejects.
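A rough sketch of how the audio-track formats of two captures can be compared. It only prints the codec four-character code, channel count, and sample rate of each audio track; fourCC(_:) and dumpAudioTracks(of:) are hypothetical helper names, and nothing here reproduces the eligibility check itself.

```swift
import AVFoundation
import CoreMedia

/// Hypothetical helper: renders a FourCharCode such as 0x6C70636D as "lpcm".
func fourCC(_ code: UInt32) -> String {
    let bytes = [24, 16, 8, 0].map { UInt8((code >> UInt32($0)) & 0xFF) }
    // Fall back to the decimal value if the bytes are not plain ASCII.
    return String(bytes: bytes, encoding: .ascii) ?? String(code)
}

/// Hypothetical helper: prints codec, channel count, and sample rate
/// for every format description of every audio track in the file.
@available(iOS 15.0, macOS 12.0, *)
func dumpAudioTracks(of url: URL) async throws {
    let asset = AVURLAsset(url: url)
    for track in try await asset.loadTracks(withMediaType: .audio) {
        for desc in try await track.load(.formatDescriptions) {
            let codec = fourCC(CMFormatDescriptionGetMediaSubType(desc))
            guard let asbd = CMAudioFormatDescriptionGetStreamBasicDescription(desc)?.pointee else {
                print("track \(track.trackID): \(codec), no stream description")
                continue
            }
            print("track \(track.trackID): \(codec), \(asbd.mChannelsPerFrame) ch, \(asbd.mSampleRate) Hz")
        }
    }
}
```

On a ProRes capture the spatial track should show up as a 4-channel "lpcm" track, while the HEVC capture should report a different codec code for its spatial track.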


Key finding: eligibility seems baked into the native APAC bitstream

Starting from an eligible HEVC/APAC file, I used AVAssetReader/AVAssetWriter to re-encode only the FOA track (APAC → LPCM → APAC), leaving the AAC stereo track, the HEVC video, and the timed-metadata track untouched. The output is structurally identical to the input, yet it is rejected (code 3). So preserving the cinematic-audio metadata track is not sufficient: re-encoding the APAC track alone is enough to lose eligibility. This suggests the mix metadata that gates eligibility is carried inside the APAC bitstream and is produced only at capture time.

Questions

Is it intended that ProRes (LPCM FOA) spatial captures are not Audio-Mix-eligible via CNAssetSpatialAudioInfo, while HEVC (APAC FOA) captures are? Is this documented?

Where exactly is the eligibility metadata stored: in the APAC bitstream, or in the cinematic-audio timed-metadata track? (Re-encoding the APAC track while preserving that metadata track still loses eligibility.)

Is there any supported way to make an existing LPCM/ProRes FOA capture eligible after the fact (a transcode/encode path that produces the required APAC), or is native capture the only source?

Any guidance, or a pointer to documentation, would be greatly appreciated. Thank you.


Environment

iPhone 16 Pro Max, iOS 26.x; macOS 26.2; Xcode 26.2.


Answered by DTS Engineer in 895328022

Thanks for the detailed write-up.

A few things to share, with explicit notes on which parts of your three questions DTS can address from public material and which parts are policy-bounded.

What the public material confirms

  • WWDC 2025 session 251 ("Enhance your app's audio recording capabilities") states directly that during ProRes recording, the spatial-audio capture pipeline encodes the audio tracks as PCM rather than APAC. The relevant passage is in the Spatial Audio capturing chapter of the transcript:

    https://developer.apple.com/videos/play/wwdc2025/251

  • The same session covers Apple's documented post-capture metadata story: after a spatial-audio recording stops, the capture process analyzes the audio and saves tuning parameters into a metadata track. The header for CNAssetSpatialAudioInfo.spatialAudioMixMetadata describes the property as "the result of audio analysis during recording which contains metadata necessary to properly configure the Audio Mix feature."

  • The CNAssetSpatialAudioInfo documentation describes its initializer as creating an instance "if it meets all requirements" and assetContainsSpatialAudio(asset:) as checking whether an asset "meets all the requirements to operate with Spatial Audio and its accompanying effects." What constitutes those requirements is not enumerated in the public documentation:

    https://developer.apple.com/documentation/cinematic/cnassetspatialaudioinfo-7hdev

  • CNCinematicErrorCodeIncomplete (code 3) is documented in the framework header as "missing needed information," which is what the localized "no eligible audio tracks in asset" message reports.

How that relates to what you describe

The format pairing you observed (ProRes with LPCM, HEVC with APAC) is consistent with the WWDC 2025 transcript above. AVAssetExportSession follows the same pattern at the export-preset level: AVAssetExportPresetAppleProRes422LPCM and AVAssetExportPresetAppleProRes4444LPCM are documented as ProRes plus LPCM, with no APAC-pairing preset shipped.

You are not the only developer hitting this path through CNAssetSpatialAudioInfo. Another thread reports the same "no eligible audio tracks in asset" error after a different transformation (Photos-app editing of an iPhone 16 spatial capture):

https://developer.apple.com/forums/thread/802641

That thread is unanswered as of this writing. The shared pattern across both: any post-capture transformation that re-encodes or re-renders the spatial audio appears to take the resulting asset out of the eligibility set. This holds even when the resulting file is structurally similar to a native capture.

Items DTS cannot address from public material

Two parts of your questions are policy-bounded for DTS responses:

  • "Is this intended behavior?" DTS does not state design intent. We can describe what the public documentation commits to and what it doesn't, but characterizing what Apple intended is outside our scope.
  • "Where exactly is the eligibility metadata stored: APAC bitstream or metadata track?" This is an internal-implementation question. DTS does not make claims about the internal storage or layout of system-managed data beyond what the public APIs commit to. The header for spatialAudioMixMetadata states the property is the result of capture-time audio analysis. WWDC 2025 session 251 says the tuning parameters are saved into a metadata track. Beyond those statements, the gate that CNAssetSpatialAudioInfo applies is internal and not something we can comment on. Your empirical finding that APAC re-encoding while preserving the metadata track loses eligibility is a useful observation for engineering, but we cannot ratify a specific internal-storage explanation.

On your third question

The publicly documented APIs for spatial audio all assume live capture rather than retrofitting an existing asset:

  • AVCaptureMultichannelAudioMode.firstOrderAmbisonics set on AVCaptureDeviceInput.multichannelAudioMode is the documented way to enable spatial-audio capture during recording.
  • AVCaptureSpatialAudioMetadataSampleGenerator analyzes live capture audio buffers and produces a timed-metadata sample for the AVAssetWriter pipeline. The class is iOS-only (API_UNAVAILABLE(macos, macCatalyst, tvos, visionos)) and is described as part of the recording pipeline. It is not documented as a way to retrofit metadata onto an existing file.
  • The reader/writer-settings methods on CNAssetSpatialAudioInfo (assetReaderOutputSettings(for:) and assetWriterInputSettings(for:)) operate on already-eligible assets to apply the audio-mix effect to LPCM output. They do not produce eligibility on a non-eligible source.

Based on the publicly documented APIs, there is no documented post-hoc transcoding or encoding path that takes an existing LPCM-FOA capture and produces an asset that satisfies CNAssetSpatialAudioInfo's eligibility check. Native capture in HEVC mode (which produces the APAC-encoded spatial track plus the metadata sample) is the documented route to an eligible asset.
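As an illustration of the capture-side configuration mentioned above, here is a minimal sketch of enabling First-Order Ambisonics on the audio input. SpatialCaptureError and addSpatialAudioInput(to:) are hypothetical names, the availability annotation is an assumption, and the video input, output, and codec selection are left out. This is a sketch under those assumptions, not a complete recording pipeline.

```swift
import AVFoundation

/// Hypothetical error type for this sketch.
enum SpatialCaptureError: Error {
    case noMicrophone
    case foaUnsupported
    case cannotAddInput
}

/// Hypothetical helper: adds the default microphone to the session with
/// First-Order Ambisonics enabled, or throws if the mode is not supported.
@available(iOS 18.0, macOS 15.0, *)  // assumed availability of multichannelAudioMode
func addSpatialAudioInput(to session: AVCaptureSession) throws -> AVCaptureDeviceInput {
    guard let microphone = AVCaptureDevice.default(for: .audio) else {
        throw SpatialCaptureError.noMicrophone
    }
    let input = try AVCaptureDeviceInput(device: microphone)

    // Only request FOA where the input reports support for it.
    guard input.isMultichannelAudioModeSupported(.firstOrderAmbisonics) else {
        throw SpatialCaptureError.foaUnsupported
    }
    input.multichannelAudioMode = .firstOrderAmbisonics

    session.beginConfiguration()
    defer { session.commitConfiguration() }
    guard session.canAddInput(input) else {
        throw SpatialCaptureError.cannotAddInput
    }
    session.addInput(input)
    return input
}
```

Whether the resulting file is eligible still depends on the video codec chosen for the movie output, as described above.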

Filing a Feedback Assistant report

A Feedback Assistant report against the Cinematic / AVFoundation components is the natural next step here. Useful contents:

  • Two captures of the same scene, one ProRes and one HEVC, both with FOA enabled, with annotation about which one passes CNAssetSpatialAudioInfo and which does not.
  • Your re-encode experiment demonstrating loss of eligibility while preserving the metadata track.
  • An explicit request for documentation about which capture configurations produce eligible spatial-audio assets, and (separately) a request for a supported transcoding path from ProRes-LPCM-FOA to an eligible format.
  • A cross-reference to https://developer.apple.com/forums/thread/802641 since that report demonstrates the same eligibility-rejection pattern from a different transformation, and indicates more than one developer is affected.

https://developer.apple.com/feedback-assistant/

For new captures where the workflow allows it, the AVAssetWriter capture path using AVCaptureSpatialAudioMetadataSampleGenerator on iOS is the documented alternative for fine-grained control of the spatial-audio recording pipeline. It produces the APAC-track-plus-metadata structure that CNAssetSpatialAudioInfo accepts. It does not help with ProRes captures (which the documented capture pipeline writes as LPCM) or with files already on disk.
