Metal

Use Metal to render advanced 3D graphics and perform data-parallel computations on graphics processors.

Background GPU Access availability

I would love to use Background GPU Access to do some video processing in the background. However, the documentation of BGContinuedProcessingTaskRequest.Resources.gpu clearly states: "Not all devices support background GPU use. For more information, see Performing long-running tasks on iOS and iPadOS." Is there a list available of currently released devices that do (or don't) support background GPU usage? That would help us understand what part of our user base can use this feature (and what hardware we need to test this on as developers). For example, it seems that it isn't supported on an iPad Pro M1 with the current iOS 26 beta. The simulators also do not seem to support the background GPU resource. So it would be great to understand what hardware is capable of using this feature!

Graphics & Games Metal Metal Background Tasks MetalFX

Views: 1.2k

Timestamp counter heap always returns zero

Hi, I am trying to use a timestamp counter heap, but it always seems to report timestamp zero. Consider this example program:

    #include <Metal/Metal.h>
    #include <assert.h>

    int main(int argc, char *argv[]) {
        auto device = MTLCreateSystemDefaultDevice();
        assert(device);

        auto descriptor = [MTL4CounterHeapDescriptor new];
        [descriptor setType:MTL4CounterHeapTypeTimestamp];
        [descriptor setCount:1];

        auto heap = [device newCounterHeapWithDescriptor:descriptor error:nullptr];
        assert(heap);
        [heap invalidateCounterRange:NSMakeRange(0, 1)];

        auto command_buffer = [device newCommandBuffer];
        assert(command_buffer);
        auto allocator = [device newCommandAllocator];
        assert(allocator);

        [command_buffer beginCommandBufferWithAllocator:allocator];
        auto encoder = [command_buffer computeCommandEncoder];
        assert(encoder);
        [encoder writeTimestampWithGranularity:MTL4TimestampGranularityPrecise intoHeap:heap atIndex:0];
        [encoder endEncoding];
        [command_buffer endCommandBuffer];

        auto queue = [device newMTL4CommandQueue];
        assert(queue);
        auto event = [device newSharedEvent];
        assert(event);

        [queue commit:&command_buffer count:1];
        [queue signalEvent:event value:1];
        [event waitUntilSignaledValue:1 timeoutMS:UINT64_MAX];

        auto data = [heap resolveCounterRange:NSMakeRange(0, 1)];
        printf("size %lu: %llu\n", data.length, *(uint64_t*)data.bytes);
        return 0;
    }

Trying to compile and run:

    % clang++ -g -O0 -o test test.mm -framework Metal -framework Foundation && MTL_DEBUG_LAYER=1 ./test
    2026-06-23 14:44:48.006 test[26472:1588857] Metal API Validation Enabled
    size 8: 0

I would have expected to receive

    size 8: [some random non-zero number]

that number being a GPU timestamp of when the command was executed, but I always get zero. Does anybody have an idea of what I am doing wrong?

Graphics & Games Metal

MDLAsset loads texture in usdz file loaded with wrong colorspace

I have a very basic usdz file from this repo. I call loadTextures() after loading the usdz via MDLAsset. Inspecting the MDLTexture object, I can tell it is assigned a linear RGB colorspace instead of sRGB, although the image file in the usdz is sRGB. This causes the textures to ultimately render oversaturated. Later in the code I convert the MDLTexture to an MTLTexture via MTKTextureLoader, but if I set the sRGB option the loader seems to ignore it. This significantly impacts the usefulness of Model I/O if it can't load a simple usdz texture correctly. Am I missing something? Thanks!

Graphics & Games Metal Graphics and Games Metal MetalKit USDZ

Comprehensive documentation and literature

The WWDC videos like the new "Boost your graphics performance with the M5 and A19 GPUs" contain extremely valuable information and tips on how to discover, diagnose and remedy performance issues. They seem to serve as quick reminders and distilled summaries of more comprehensive documentation that I assume can be found somewhere. Where do we find the underlying comprehensive documentation that explains Apple Silicon GPU architecture? How can I learn to understand the basis of the data presented by the Xcode Metal Debugger? Any hints at external literature and resources are welcome.

Graphics & Games Metal

Views: 184

Documentation and literature

Graphics & Games Metal

Views: 151

Performance Optimization for Large-Kernel Image Processing

I am processing large images where each output pixel depends on a large neighborhood of surrounding pixels. As a result, the shader performs a very high number of texture sampling operations, which appears to cause cache misses and becomes a performance bottleneck. Since neighboring threads often process adjacent pixels, many of the sampled pixels overlap between threads. Although each thread operates on a slightly different output pixel, a large portion of the texture accesses are effectively identical. Does Metal provide mechanisms that allow neighboring threads to share or synchronize intermediate results in order to reduce redundant texture fetches? Are there recommended approaches for exploiting data reuse across threads, for example through threadgroup memory or other Metal-specific features? In this type of workload, how effective is texture gathering (gather) for reducing sampling overhead, especially when only the RGB channels of an RGBA texture are required? Would using gather generally improve cache utilization and performance in this scenario? When using gather, what is the preferred way to handle texture borders and edge conditions without introducing per-thread branching (e.g., explicit if statements)? Any recommendations for optimizing large-radius neighborhood operations in Metal would be greatly appreciated.
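To put rough numbers on the data reuse described in this post, here is a minimal CPU-side C++ sketch, not Metal shader code. It assumes one commonly discussed approach: a square threadgroup cooperatively loads its tile plus a halo into threadgroup memory once and then filters from that copy. The function names, the 16x16 group size, and the radius of 8 are hypothetical choices for illustration. The `clampToEdge` helper shows border handling written without an explicit `if`; a sampler with a clamp-to-edge address mode would be another option.

```cpp
#include <algorithm>
#include <cstdint>

// Texture fetches per threadgroup if each of the tile*tile threads reads its
// whole (2r+1) x (2r+1) neighborhood directly from the texture.
constexpr std::uint64_t naiveFetches(std::uint64_t tile, std::uint64_t r) {
    return tile * tile * (2 * r + 1) * (2 * r + 1);
}

// Fetches if the group loads the tile plus an r-pixel halo once
// (for example into threadgroup memory) and filters from that copy.
constexpr std::uint64_t tiledFetches(std::uint64_t tile, std::uint64_t r) {
    return (tile + 2 * r) * (tile + 2 * r);
}

// Clamp-to-edge addressing without an explicit if statement.
// Assumes size >= 1.
inline int clampToEdge(int coord, int size) {
    return std::min(std::max(coord, 0), size - 1);
}

// Compile-time check of the arithmetic for a 16x16 group and radius 8.
static_assert(naiveFetches(16, 8) == 73984, "256 threads * 17 * 17 taps");
static_assert(tiledFetches(16, 8) == 1024, "(16 + 2 * 8)^2 texels");
```

Under these assumptions the cooperative load touches 1,024 texels where direct sampling issues 73,984 fetches. Whether that translates into a speedup depends on threadgroup memory limits, barrier cost, and how well the hardware texture cache already absorbs the overlap, so it is worth profiling both variants.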

Graphics & Games Metal

Views: 181

Opportunities to use Apple intelligence.

Are there opportunities for developers to use Apple Intelligence models through Metal in ways that unlock new rendering, simulation, or real-time content generation techniques?

Graphics & Games Metal

Views: 173

Memory allocation of textures in Metal

When does Metal allocate and deallocate memory for textures? I've observed that the textures live for the whole lifetime of the commandBuffer. So, if I have multiple large textures that I need in subsequent shaders, it would make sense to work with multiple commandBuffers to enable deallocation in order to reduce peak memory usage. Is that correct? Do you have any other suggestions on how to reduce peak memory usage when working with large Metal textures? Hint: I am using compute shaders only.

Graphics & Games Metal

Views: 191

MetalFX upscaler/denoiser and instant changes

Hi, What's the best way to handle drastic changes in scene characteristics with the new MTLFXTemporalDenoisedScaler? Let's say a visible object in the scene radically changes its material properties. I can modify the albedo and roughness textures accordingly, but I suspect the history will be corrupted: blending visual information between the new frame and the previous ones might be nonsense. I guess the same problem arises when objects appear or disappear instantly. Does the upscaler handle these events for us (by lowering blending), or should we use the reactive mask, the denoise strength mask, or something similar to handle them?

Graphics & Games Metal MetalFX

Views: 440

metal shader converter library distribution

The documentation is unclear. I need clarification on Metal shader converter library distribution. Am I allowed to distribute the library as part of my macOS app bundle?

Graphics & Games Metal Metal Shader Converter

Views: 147

Metal GPU Driver Crash on M5 Pro + macOS 26.5 — kIOGPUCommandBufferCallbackErrorOutOfMemory with <2GB working sets

Summary

The Metal driver AGXMetalG17X 351.2 on macOS 26.5 (25F71) for the M5 Pro chip crashes with kIOGPUCommandBufferCallbackErrorOutOfMemory (00000008) when running LLM inference workloads with working sets as small as ~1.5GB, despite 24GB of unified memory being available and Apple Diagnostics confirming the hardware is fully functional. This affects multiple tools: MLX, llama.cpp (Metal backend), and native apps using Metal for inference.

System

Component    Value
Model        MacBook Pro (Mac17,9)
Chip         Apple M5 Pro (applegpu_g17s)
GPU Cores    16
RAM          24 GB LPDDR5
macOS        26.5 (25F71)
Metal        Metal 4
GPU Driver   AGXMetalG17X 351.2
Xcode        26.5 (17F42)

Reproduction

MLX (Python)

    pip install mlx mlx-lm
    python -m mlx_lm.generate \
      --model mlx-community/Qwen2.5-3B-Instruct-4bit \
      --max-tokens 10 \
      --prompt "Hello"

Expected: Normal text generation
Actual: Crash with:

    libc++abi: terminating due to uncaught exception of type std::runtime_error: [METAL] Command buffer execution failed: Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)

llama.cpp

    brew install llama.cpp
    llama-cli --model model.gguf --prompt "Hello" --n-predict 20 --n-gpu-layers 99

Expected: Fast GPU generation
Actual: Process hangs indefinitely

Test Results

Tool        Model                Peak Memory   Result
MLX         Qwen2.5-0.5B-4bit    0.36 GB       ✅ Works
MLX         Qwen2.5-1.5B-4bit    0.98 GB       ✅ Works
MLX         Qwen3-1.7B-4bit      1.01 GB       ✅ Works
MLX         Qwen2.5-3B-4bit      ~1.5 GB       ❌ Metal OOM crash
MLX         Qwen3-4B-4bit        ~2.1 GB       ❌ Metal OOM crash
MLX         Qwen3-8B-4bit        ~4.5 GB       ❌ Metal OOM crash
llama.cpp   Qwen2.5-0.5B GGUF    ~0.5 GB       ❌ Hangs with GPU
llama.cpp   Qwen2.5-0.5B GGUF    ~0.5 GB       ✅ Works with CPU only

Key Evidence

• Hardware is healthy: Apple Diagnostics passed all tests
• Basic Metal works: matmul, array ops work fine
• CPU inference works: llama.cpp with -ngl 0 runs correctly
• The error is NOT about actual memory exhaustion: kIOGPUCommandBufferCallbackErrorOutOfMemory means the kernel rejects the Metal memory commit, not that physical memory is full. The system reports 17.76GB available for Metal working set.

Crash Log Extract

    Thread 31 Crashed:
    0 libsystem_kernel.dylib __pthread_kill + 8
    1 libsystem_pthread.dylib pthread_kill + 296
    2 libsystem_c.dylib abort + 148
    3 Metal MTLReportFailure.cold.1 + 48
    4 Metal MTLReportFailure + 576
    5 Metal -[_MTLCommandBuffer addCompletedHandler:] + 104
    ...
    Exception Type: EXC_CRASH (SIGABRT)
    Termination Reason: Namespace SIGNAL, Code 6, Abort trap: 6

Related Issues

• ml-explore/mlx#3586: Metal compiler regression on macOS 26.5
• ml-explore/mlx#3534: M5 float32 precision issue
• ml-explore/mlx#3568: M5 random divergence
• ml-explore/mlx#3539: Metal residency OOM (M4 Max)

Request

Please investigate the AGXMetalG17X driver for M5 Pro on macOS 26.5. The driver appears to incorrectly reject Metal memory commits for LLM inference workloads, even when the working set is well within the system's reported limits (1.5GB requested vs 17.76GB available). Happy to provide full crash logs, sysdiagnose archives, or run additional tests.

Graphics & Games Metal Metal macOS Apple Silicon metal-cpp

Views: 306

May ’26

Inexplicable Metal crash ever since iOS 26.5 beta 4

Hi all, I'm working on updating my audio visualizer app. I'm adding new visualizers based on Metal 4 compute shaders. They worked in iOS 26.4 and iOS 26.5 up until beta 3. However, after that, the visualizers started crashing the phone and forcing a restart. On the latest version of iOS 26.5, the crash is still there. I submitted feedback, but haven't heard anything back just yet. I was wondering if others have faced this same issue, and if there are any workarounds. Here is my repo if you want to look at the code (forgive me if it's sloppy, I'm quite new to graphics programming and Metal): https://github.com/aabagdi/VisualMan/tree/main Thank you!

Graphics & Games Metal Metal MetalKit

Views: 1.5k

May ’26

Metal 4 support in iOS simulator

I'm updating our app to support Metal 4, but the Metal 4 types don't seem to be recognized when targeting the simulator. Is it known whether Metal 4 will be supported there in the near future, or am I setting up the app wrong?

Graphics & Games Metal Graphics and Games Simulator

Views: 1.4k

May ’26

Using setVertexBytes for index primitives

When drawing indexed primitives, is there a way to provide the indices from temporary data, as setVertexBytes does for vertex data? Right now I have to create a temporary Metal buffer even for a small number of vertices and discard it after rendering with drawIndexedPrimitives.

Graphics & Games Metal

Views: 654

May ’26

MTL4FXTemporalDenoisedScaler initialization

I’m trying to use MTL4FXTemporalDenoisedScaler, and I’m seeing a crash during initialization even with a very simple sample app. I created a minimal sample here: https://github.com/tatsuya-ogawa/MetalFXInitExample

The exception is:

    NSException: "-[AGXG16XFamilyHeap baseObject]: unrecognized selector sent to instance ..."

What I found is:

• This works: descriptor.makeTemporalDenoisedScaler(device: device)
• This crashes: descriptor.makeTemporalDenoisedScaler(device: device, compiler: metal4Compiler)

So the issue seems to happen only with the Metal4FX version. For testing, I’m using an iPhone 15 Pro. According to the Metal Feature Set Tables, MetalFX denoised upscaling should be supported on Apple9 and later, so I believe the device itself should meet the requirements. Reference: https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf

Has anyone seen this before, or knows what might be causing it? I’d appreciate any advice. Thanks.

Graphics & Games Metal MetalFX

Views: 562

Apr ’26

Cannot load .mtlpackage to MTLLibrary

After watching the WWDC 2025 session "Combine Metal 4 machine learning and graphics", I decided to try integrating the latest MTL4MachineLearningCommandEncoder into my existing render pipeline. After a lot of trial and error, I managed to set up the pipeline and get the app to compile. However, I am now stuck on creating an MTLLibrary from a .mtlpackage. Here is the code I have to create an MTLLibrary, following the WWDC session https://developer.apple.com/videos/play/wwdc2025/262/?time=550:

    let coreMLFilePath = bundle.path(forResource: "my_model", ofType: "mtlpackage")!
    let coreMLURL = URL(string: coreMLFilePath)!
    do {
        metalDevice.makeLibrary(URL: coreMLURL)
    } catch {
        print("error: \(error)")
    }

With the above code, I am getting this error:

    Error Domain=MTLLibraryErrorDomain Code=1 "Invalid metal package" UserInfo={NSLocalizedDescription=Invalid metal package}

What is the correct way to create an MTLLibrary from a .mtlpackage? Do I see this error because the .mtlpackage I am using is incorrect? How should I go about debugging this? I'd really appreciate some help, as I have been stuck on this for some time now. Thanks in advance!

Graphics & Games Metal Metal MetalKit

Views: 685

Apr ’26

Can a compute pipeline be as efficient as a render pipeline for rasterization?

I'm new to graphics and game design, and I wanted to know whether a compute pipeline could be as efficient as a render pipeline for rasterization, with an explanation of how and why. Also, is it possible to perform rasterization manually with a render pipeline, that is, to manipulate individual pixel data in a Metal texture yourself?
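To make "manual rasterization" concrete, here is a minimal CPU-side C++ sketch of the per-pixel coverage test that a compute kernel would have to run itself in place of the fixed-function rasterizer. It is not Metal code and makes no claim about relative performance. `Vec2`, `edgeFunction`, and `rasterizeTriangle` are hypothetical names, and the sketch skips the fill rule, clipping, and attribute interpolation that a real rasterizer handles.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

struct Vec2 { float x, y; };

// Edge function: twice the signed area of triangle (a, b, p).
// Its sign tells which side of the edge a->b the point p is on.
inline float edgeFunction(Vec2 a, Vec2 b, Vec2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Sets image[y * width + x] = 1 for every pixel whose center lies inside or
// on the triangle, and returns the number of covered pixels.
// `image` is row-major and must hold width * height entries.
int rasterizeTriangle(std::vector<std::uint8_t>& image, int width, int height,
                      Vec2 v0, Vec2 v1, Vec2 v2) {
    // Bounding box of the triangle, clamped to the image.
    const int minX = std::max(0, static_cast<int>(std::floor(std::min({v0.x, v1.x, v2.x}))));
    const int maxX = std::min(width - 1, static_cast<int>(std::ceil(std::max({v0.x, v1.x, v2.x}))));
    const int minY = std::max(0, static_cast<int>(std::floor(std::min({v0.y, v1.y, v2.y}))));
    const int maxY = std::min(height - 1, static_cast<int>(std::ceil(std::max({v0.y, v1.y, v2.y}))));

    int covered = 0;
    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            const Vec2 p{x + 0.5f, y + 0.5f};  // sample at the pixel center
            const float w0 = edgeFunction(v1, v2, p);
            const float w1 = edgeFunction(v2, v0, p);
            const float w2 = edgeFunction(v0, v1, p);
            // Accept either winding order: all signs must agree.
            const bool allNonNegative = w0 >= 0 && w1 >= 0 && w2 >= 0;
            const bool allNonPositive = w0 <= 0 && w1 <= 0 && w2 <= 0;
            if (allNonNegative || allNonPositive) {
                image[y * width + x] = 1;
                ++covered;
            }
        }
    }
    return covered;
}
```

In a compute kernel, each iteration of the inner loop would typically become one thread. Pixels exactly on an edge are counted as covered here, so two triangles sharing an edge would both write those pixels; hardware rasterizers avoid that with a fill rule.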

Graphics & Games Metal Graphics and Games Metal 2D Graphics

Views: 757

Apr ’26

Question on setVertexBytes

I think if your buffer is smaller than 4 KB, it's recommended to use setVertexBytes. My question is: can I keep hammering on setVertexBytes as the primary method to issue multiple draw calls within a render buffer and rely on Metal to figure out how to orphan and replace the target buffer? A lot of the primitives I am drawing are smaller than 4 KB, and wiring down larger segments of memory for individual buffers for each draw call seems like a negative. It's also simpler to copy, submit, and forget about buffer synchronization.
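The trade-off in this post can be sketched on the CPU side. The following C++ is not Metal API code: it only models the size check from the post (the 4 KB figure) next to one alternative that is sometimes used for larger transient data, a per-frame bump allocator that hands out slices of a single reusable buffer. `fitsInline`, `alignUp`, and `FrameArena` are hypothetical names, and the 256-byte alignment in the usage note is an assumed, conservative value rather than a documented requirement.

```cpp
#include <cstddef>

// Size guidance for setVertexBytes as stated in the post (4 KB).
constexpr std::size_t kInlineLimitBytes = 4096;

// True when the data is small enough for the inline copy-and-forget path.
constexpr bool fitsInline(std::size_t sizeBytes) {
    return sizeBytes < kInlineLimitBytes;
}

// Rounds offset up to a multiple of alignment (alignment must be a power of two).
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical per-frame bump allocator over one large, reusable buffer.
// It only tracks offsets; the caller owns the actual memory.
class FrameArena {
public:
    static constexpr std::size_t kFailed = static_cast<std::size_t>(-1);

    explicit FrameArena(std::size_t capacityBytes) : capacity_(capacityBytes) {}

    // Returns the byte offset of a new slice, or kFailed if it does not fit.
    std::size_t allocate(std::size_t sizeBytes, std::size_t alignment) {
        const std::size_t start = alignUp(head_, alignment);
        if (start > capacity_ || sizeBytes > capacity_ - start) {
            return kFailed;
        }
        head_ = start + sizeBytes;
        return start;
    }

    // Call only after the GPU has finished the frame that used these slices.
    void reset() { head_ = 0; }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;
};
```

A draw path could call `fitsInline(size)` first and fall back to `arena.allocate(size, 256)` for anything larger, binding the shared buffer at the returned offset. The arena still needs the frame-level synchronization the post hopes to avoid, which is the cost of not creating a buffer per draw.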

Graphics & Games Metal

Views: 816

Apr ’26

GPTK 3 and D3DMetal issue with Modern Pipeline Creation

Death Stranding 2: On the Beach (v1.0.48.0, Steam) crashes during rendering initialization when running through CrossOver 26 with D3DMetal 3.0 on an Apple M2 Max Mac Studio running macOS Sequoia. The game successfully initializes Streamline, NVAPI, DLSS (Result::eOk), DLSSG (Result::eOk), Reflex, and XeSS; all subsystems report success. The crash occurs immediately after, during rendering pipeline creation, before the game reaches NXStorage initialization or window creation.

Minidump analysis confirms the crash is an access violation (0xc0000005) at DS2.exe+0x67233d, writing to address 0x0. RAX=0x0 (null pointer being dereferenced), R12=0xFFFFFFFFFFFFFFFF (error/invalid handle return). The game appears to call a D3D12 API, likely CheckFeatureSupport or a pipeline state creation function, that D3DMetal acknowledges as supported but for which it returns null or invalid data. The game trusts the response and dereferences the null pointer.

Two other Nixxes titles using the same engine and D3DMetal setup run without issue: Spider-Man 2 (~50 FPS) and Horizon Zero Dawn Remastered (~34 FPS). DS2 uses newer technology versions (DLSS 4, FSR 4, XeSS 2) and a newer DirectX 12 Agility SDK, which likely queries D3D12 features that D3DMetal does not yet fully implement. The crash also reproduces when D3DMetal reports as AMD vendor (1002) instead of NVIDIA (10de), crashing at the same executable offset, confirming it is a D3D12 feature reporting gap in D3DMetal rather than a vendor-specific issue.

How To Reproduce

1. Install CrossOver 26+ on macOS 26.4
2. Install Steam and download Death Stranding 2
3. Run Death Stranding 2 and check logs after crash in Documents\DEATH STRANDING 2 ON THE BEACH

Feedback Requests

FB22285513: Game Porting Toolkit 3 issue with Modern Pipeline Creation

Graphics & Games Metal Metal MetalKit Metal Performance Shaders Game Porting Toolkit

Views: 945

Apr ’26

Xcode26 Replay frame broken

I got a broken frame when using Xcode to capture a frame from a Unity game and replay it. It seems like the vertex buffer is broken; I see a bunch of "nan"s in the vertex buffer. However, the game displays correctly when running, and this only started happening after I upgraded Xcode and my iPhone to Xcode 26 and iOS 26.

Graphics & Games Metal Metal Xcode

Views: 546

Apr ’26

Background GPU Access availability

Graphics & Games Metal Metal Background Tasks MetalFX

Replies: 5
Boosts: 0
Views: 1.2k
Activity: 5h

Timestamp counter heap always returns zero

Graphics & Games Metal

Replies: 0
Boosts: 0
Views: 73
Activity: 4d

MDLAsset loads texture in usdz file loaded with wrong colorspace

Graphics & Games Metal Graphics and Games Metal MetalKit USDZ

Replies: 4
Boosts: 3
Views: 1k
Activity: 6d

Comprehensive documentation and literature

Graphics & Games Metal

Replies: 1
Boosts: 0
Views: 184
Activity: 2w

Documentation and literature

Graphics & Games Metal

Replies: 0
Boosts: 0
Views: 151
Activity: 2w

Performance Optimization for Large-Kernel Image Processing

Graphics & Games Metal

Replies: 1
Boosts: 0
Views: 181
Activity: 2w

Opportunities to use Apple intelligence.

Graphics & Games Metal

Replies: 1
Boosts: 0
Views: 173
Activity: 2w

Memory allocation of textures in Metal

Graphics & Games Metal

Replies: 1
Boosts: 1
Views: 191
Activity: 2w

MetalFX upscaler/denoiser and instant changes

Graphics & Games Metal MetalFX

Replies: 3
Boosts: 0
Views: 440
Activity: 2w

metal shader converter library distribution

Graphics & Games Metal Metal Shader Converter