Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

Pre-inference AI Safety Governor for FoundationModels (Swift, On-Device)
Greetings, and Happy Holidays. I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community.

The Problem
Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain.

What Newton Does
Newton validates every prompt pre-inference and returns:
- Phase (0/1/7/8/9)
- Shape classification
- Confidence score
- Full audit trace
If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model.

v1.3 Detection Categories (14 total)
- Jailbreak / prompt injection
- Corrosive self-negation ("I hate myself")
- Hedged corrosive ("Not saying I'm worthless, but...")
- Emotional dependency ("You're the only one who understands")
- Third-person manipulation ("If you refuse, you're proving nobody cares")
- Logical contradictions ("Prove truth doesn't exist")
- Self-referential paradox ("Prove that proof is impossible")
- Semantic inversion ("Explain how truth can be false")
- Definitional impossibility ("Square circle")
- Delegated agency ("Decide for me")
- Hallucination-risk prompts ("Cite the 2025 CDC report")
- Unbounded recursion ("Repeat forever")
- Conditional unbounded ("Until you can't")
- Nonsense / low semantic density

Test Results
94.3% catch rate on 35 adversarial test cases (33/35 passed).

Architecture
User Input
  ↓
[ Newton ] → Validates prompt, assigns Phase
  ↓
Phase 9? → [ FoundationModels ] → Response
Phase 1/7/8? → Blocked with explanation

Key Properties
- Deterministic (same input → same output)
- Fully auditable (ValidationTrace on every prompt)
- On-device (no network required)
- Native Swift / SwiftUI
- String Catalog localization (EN/ES/FR)
- FoundationModels-ready (#if canImport)

Code Sample — Validation
let governor = NewtonGovernor()
let result = governor.validate(prompt: userInput)

if result.permitted {
    // Proceed to FoundationModels
    let session = LanguageModelSession()
    let response = try await session.respond(to: userInput)
} else {
    // Handle block
    print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)")
    print(result.trace.summary) // Full audit trace
}

Questions for the Community
- Anyone else building pre-inference validation for FoundationModels?
- Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail?
- Interest in Shape Theory classification for prompt complexity?
- Best practices for integrating with LanguageModelSession?

Links
GitHub: https://github.com/jaredlewiswechs/ada-newton
Technical overview: parcri.net

Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device. parcri.net has the link :)
1
0
501
Dec ’25
How to Ensure Controlled and Contextual Responses Using Foundation Models ?
Hi everyone, I’m currently exploring the use of Foundation Models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModels APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly:
- Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.)
- Preventing hallucinations or unrelated outputs
- Constraining responses based on app-specific rules, structured data, or recent interactions
I’ve experimented with prompts, system-style instructions, and few-shot examples to steer outputs, but even with carefully crafted prompts, the model occasionally produces incorrect or out-of-scope responses. Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved? (A rough sketch of one possible setup follows this entry.) Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
0
0
135
Jul ’25
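A rough sketch of one possible setup for the post above, not an official pattern: instructions pin the tone and domain, and a tool feeds app-owned data back to the model so answers stay in scope. The Tool conformance follows the SectionReader example shown in another post in this topic; the session initializer taking tools and instructions, the tool name, the hotel domain values, and response.content are assumptions for illustration.

import FoundationModels

// Hypothetical domain tool; the conformance pattern mirrors the SectionReader
// example elsewhere on this page.
struct RoomAvailabilityTool: Tool {
    let name: String = "checkRoomAvailability"
    let description: String = "Look up availability for a hotel room type."

    var parameters: GenerationSchema {
        GenerationSchema(
            type: GeneratedContent.self,
            properties: [
                GenerationSchema.Property(
                    name: "roomType",
                    description: "The room type to check.",
                    type: String.self,
                    guides: [.anyOf(["standard", "deluxe", "suite"])]
                )
            ]
        )
    }

    func call(arguments: GeneratedContent) async throws -> String {
        let roomType = try arguments.value(String.self, forProperty: "roomType")
        // Return app-owned data; the instructions ask the model to prefer tool output.
        return "Rooms of type \(roomType) are available this weekend."
    }
}

// Instructions pin the domain and tone; the tool supplies the facts.
let session = LanguageModelSession(
    tools: [RoomAvailabilityTool()],
    instructions: """
        You are a concierge assistant for a hotel app. Only answer questions \
        about this hotel. If a request is out of scope, say you can't help \
        with that. Prefer information returned by tools over prior knowledge.
        """
)

let response = try await session.respond(to: "Is a suite free this weekend?")
print(response.content)

With several tools, keeping each tool's name and description narrow and unambiguous is generally what the model uses to pick a pathway, so overlapping descriptions are worth avoiding.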
Best practices for designing proactive FinTech insights with App Intents & Shortcuts?
Hello fellow developers, I'm the founder of a FinTech startup, Cent Capital (https://cent.capital), where we are building an AI-powered financial co-pilot. We're deeply exploring the Apple ecosystem to create a more proactive and ambient user experience.

A core part of our vision is to use App Intents and the Shortcuts app to surface personalized financial insights without the user always needing to open our app. For example, suggesting a Shortcut like, "What's my spending in the 'Dining Out' category this month?" or having an App Intent proactively surface an insight like, "Your 'Subscriptions' budget is almost full."

My question for the community is about the architectural and user experience best practices for this (a rough sketch follows this entry):
- How are you thinking about the balance between providing rich, actionable insights via Intents without being overly intrusive or "spammy" to the user?
- What are the best practices for designing the data model that backs these App Intents for a complex domain like personal finance?
- Are there specific performance or privacy considerations we should be aware of when surfacing potentially sensitive financial data through these system-level integrations?

We believe this is the future of FinTech apps on iOS and would love to hear how other developers are thinking about this challenge. Thanks for your insights!
0
0
334
Oct ’25
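A rough, hypothetical App Intent sketch for the post above; it is not Cent Capital's actual data model. The intent, category values, and amounts are stand-ins for the app's real store. The design point it illustrates is that only a derived, already-aggregated figure crosses the intent boundary, not raw transactions.

import AppIntents

struct CategorySpendingIntent: AppIntent {
    static var title: LocalizedStringResource = "Check Category Spending"
    static var description = IntentDescription("Shows this month's spending for a budget category.")

    @Parameter(title: "Category")
    var category: String

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // Stand-in for the app's real on-device store; only the aggregated
        // number is exposed to the system, never individual transactions.
        let monthToDate: [String: Decimal] = ["Dining Out": 214.37, "Subscriptions": 48.99]
        let spent = monthToDate[category, default: 0]
        let formatted = spent.formatted(.currency(code: "USD"))
        return .result(dialog: "You've spent \(formatted) on \(category) this month.")
    }
}

Exposed through an AppShortcutsProvider (like the WatchShortcuts example further down this page), a phrase such as "What's my Dining Out spending in <app name>" could then surface the insight from Siri or Spotlight without opening the app.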
Building a 4-agent autonomous coding pipeline on Apple Silicon — MLX backend questions
Hi, I'm building ANF (Autonomous Native Forge) — a cloud-free, 4-agent autonomous software production pipeline running on local hardware with local LLM inference. No middleware, pure Node.js native. Currently running on NVIDIA Blackwell GB10 with vLLM + DeepSeek-R1-32B. Now porting to Apple Silicon.

Three technical questions (a rough concurrency probe follows this entry):
1. How production-ready is mlx-lm's OpenAI-compatible API server for long-context generation (32K tokens)?
2. What's the recommended approach for KV cache management with the Unified Memory architecture — any specific flags or configurations for M4 Ultra?
3. MLX vs GGUF (llama.cpp) for a multi-agent pipeline where 4 agents call the inference endpoint concurrently — which handles parallel requests better on Apple Silicon?

GitHub: github.com/trgysvc/AutonomousNativeForge
Any guidance appreciated.
0
0
225
6d
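For question 3 above, one crude way to compare backends is to fire concurrent requests at the local server and time them. The sketch below is Swift rather than Node.js, purely for illustration; it assumes an OpenAI-compatible server (such as mlx_lm.server) is already listening on localhost:8080, and the model id, port, and prompts are placeholders.

import Foundation

struct ChatRequest: Encodable {
    let model: String
    let messages: [[String: String]]
    let max_tokens: Int
}

// Times one chat-completion round trip against the local endpoint.
func timedCompletion(prompt: String) async throws -> TimeInterval {
    var request = URLRequest(url: URL(string: "http://localhost:8080/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        ChatRequest(model: "local",   // placeholder model id
                    messages: [["role": "user", "content": prompt]],
                    max_tokens: 256)
    )
    let start = Date()
    let (_, response) = try await URLSession.shared.data(for: request)
    guard (response as? HTTPURLResponse)?.statusCode == 200 else {
        throw URLError(.badServerResponse)
    }
    return Date().timeIntervalSince(start)
}

// Simulates the four agents hitting the endpoint at once.
func probeConcurrency() async throws {
    let latencies = try await withThrowingTaskGroup(of: TimeInterval.self) { group in
        for agent in 0..<4 {
            group.addTask {
                try await timedCompletion(prompt: "Agent \(agent): summarize the build plan.")
            }
        }
        return try await group.reduce(into: [TimeInterval]()) { $0.append($1) }
    }
    print("Per-request latency under 4-way concurrency:", latencies)
}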
Hardware Support for Low Precision Data Types?
Hi all, I'm trying to find out if/when we can expect mxfp8/mxfp4 support on Apple Silicon. I've noticed that mlx now has casting data types, but all computation is still done in bf16. Would be great to reduce power consumption with support for these lower precision data types since edge inference is already typically done at a lower precision! Thanks in advance.
0
0
312
Nov ’25
How can I change the output dimensions of a CoreML model in Xcode when the outputs come from a NonMaximumSuppression layer?
After exporting a custom model with nms=True, the outputs show in Xcode as:
confidence: MultiArray (0 × 5)
coordinates: MultiArray (0 × 4)
I want to set fixed shapes (e.g., 100 × 5, 100 × 4), but Xcode does not allow editing—the shape fields are locked. The model graph shows both outputs come directly from a NonMaximumSuppression layer. Is it possible to set fixed output dimensions for NMS outputs in CoreML? (A rough runtime-side sketch follows this entry.)
2
0
213
2w
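Not a way to change the compiled model's declared output shape, which is what the post above asks about, but a sketch of consuming the NMS outputs as they are at runtime: the leading dimension is the number of boxes that survived suppression and varies per image (it can be 0). The feature names "confidence" and "coordinates" match the post; everything else is an assumption for illustration.

import CoreML

// Reads variable-length NMS outputs from a prediction's feature provider,
// assuming confidence is (N × classCount) and coordinates is (N × 4).
func readDetections(from output: MLFeatureProvider) -> [(box: [Double], scores: [Double])] {
    guard let confidence = output.featureValue(for: "confidence")?.multiArrayValue,
          let coordinates = output.featureValue(for: "coordinates")?.multiArrayValue else {
        return []
    }
    let count = confidence.shape[0].intValue        // boxes kept by NMS; may be 0
    let classCount = confidence.shape[1].intValue   // e.g. 5
    var detections: [(box: [Double], scores: [Double])] = []
    for i in 0..<count {
        let box = (0..<4).map { coordinates[[i as NSNumber, $0 as NSNumber]].doubleValue }
        let scores = (0..<classCount).map { confidence[[i as NSNumber, $0 as NSNumber]].doubleValue }
        detections.append((box: box, scores: scores))
    }
    return detections
}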
SpeechTranscriber time indexes - detect pauses?
I'm experimenting with the new SpeechTranscriber in macOS/iOS 26, transcribing speech from a prerecorded mp4 file. Speed and quality are amazing!

I've told the transcriber to include time indexes. Each run is always exactly one word, which can be very useful. When I look at the indexes, the end of one run is always identical to the start of the next run, even if there's a pause. I'd like to identify pauses, perhaps to generate something like phrases for subtitling. With each run of text going into the next I can't do this, other than using punctuation - which might be rather rough.

Any suggestions on detecting pauses, or getting that kind of metadata from the transcriber? (A rough gap-based grouping sketch follows this entry.)

Here's a short sample, showing each run with the start, end, and characters in the run:
105.9 --> 107.04 I
107.04 --> 107.16 think
107.16 --> 108.0 more
108.0 --> 108.42 lighting
108.42 --> 108.6 is
108.6 --> 108.72 definitely
108.72 --> 109.2 needed,
109.2 --> 109.92 downtown.
109.98 --> 110.4 My
110.4 --> 110.52 only
110.52 --> 110.7 question
110.7 --> 111.06 is,
111.06 --> 111.48 poll
111.48 --> 111.78 five,
111.78 --> 111.84 that
111.84 --> 112.08 you're
112.08 --> 112.38 increasing
112.38 --> 112.5 the
112.5 --> 113.34 50,000?
113.4 --> 113.58 Where
113.58 --> 113.88 exactly
0
0
255
Jun ’25
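One low-tech option for the question above, using nothing beyond the per-word runs already being collected: group runs into phrases wherever the gap between one run's end and the next run's start exceeds a threshold. If runs really are always back-to-back this finds nothing and punctuation remains the fallback, but in the sample data there are small gaps at sentence boundaries (e.g. 109.92 to 109.98), so even a very small threshold splits there. The Run type and the 0.35 s default are arbitrary stand-ins, not values from the Speech framework.

// Pure bookkeeping on top of the transcriber's per-word runs.
struct Run {
    let start: Double
    let end: Double
    let text: String
}

func phrases(from runs: [Run], minGap: Double = 0.35) -> [[Run]] {
    var result: [[Run]] = []
    var current: [Run] = []
    for run in runs {
        if let last = current.last, run.start - last.end >= minGap {
            result.append(current)      // gap is big enough: close the phrase
            current = []
        }
        current.append(run)
    }
    if !current.isEmpty { result.append(current) }
    return result
}

// Example with a few runs from the sample above; the tiny threshold splits
// at the 0.06 s gap after "downtown."
let sample = [
    Run(start: 108.72, end: 109.2, text: "needed,"),
    Run(start: 109.2, end: 109.92, text: "downtown."),
    Run(start: 109.98, end: 110.4, text: "My"),
]
for phrase in phrases(from: sample, minGap: 0.05) {
    print(phrase.map(\.text).joined(separator: " "))
}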
ILMessageFilterExtension memory limit
I’m considering creating an ILMessageFilterExtension that uses a small LLM/SLM to detect fraud. I’ve read that message filter extensions have strict memory limits, yet I can’t find the limit in the documentation. What is the limit, and are there any other constraints affecting the feasibility of running a 100–500 MB model?
0
0
80
Apr ’25
ImagePlayground: Programmatic Creation Error
Hardware: MacBook Pro M4, Nov 2024
Software: macOS Tahoe 26.0 & Xcode 26.0
Apple Intelligence is activated and the Image Playground macOS app works.

Running the following in Xcode throws ImagePlayground.ImageCreator.Error.creationFailed. Any suggestions on how to make this work?

import Foundation
import ImagePlayground

Task {
    let creator = try await ImageCreator()
    guard let style = creator.availableStyles.first else {
        print("No styles available")
        exit(1)
    }
    let images = creator.images(
        for: [.text("A cat wearing mittens.")],
        style: style,
        limit: 1)
    for try await image in images {
        print("Generated image: \(image)")
    }
    exit(0)
}
RunLoop.main.run()
0
0
329
Sep ’25
AppShortcuts.xcstrings does not translate each invocation phrase option separately, just the first
Due to our min iOS version, this is my first time using .xcstrings instead of .strings for AppShortcuts. When using the "migrate .strings to .xcstrings" Xcode context menu option, an .xcstrings catalog is produced that, as expected, has each invocation phrase as a separate string key. However, after compilation, the catalog changes to group all invocation phrases under the first phrase listed for each intent (see attached screenshot). It is possible to hover in blank space on the right and add more translations, but there is no 1:1 key matching requirement to the phrases on the left, nor a requirement that there are the same number of keys in one language vs. another. (The lines just happen to align due to my window size.)

What does that mean, practically?
- Do all sub-phrases in each language in AppShortcuts.xcstrings get processed during compilation, even if there isn't an equivalent phrase key declared in the AppShortcut (e.g., the ja translation has more phrases than the English)? (That makes some logical sense, as these phrases need not be 1:1 across languages.)
- In the AppShortcut declaration, if I delete all but the top invocation phrase, does nothing change with Siri?
- Is there something I'm doing incorrectly?

struct WatchShortcuts: AppShortcutsProvider {
    static var appShortcuts: [AppShortcut] {
        AppShortcut(
            intent: QuickAddWaterIntent(),
            phrases: [
                "\(.applicationName) log water",
                "\(.applicationName) log my water",
                "Log water in \(.applicationName)",
                "Log my water in \(.applicationName)",
                "Log a bottle of water in \(.applicationName)",
            ],
            shortTitle: "Log Water",
            systemImageName: "drop.fill"
        )
    }
}
0
0
325
Aug ’25
Foundation Models: Is the .anyOf guide guaranteed to produce a valid string?
I've created the following Foundation Models Tool, which uses the .anyOf guide to constrain the LLM's generation of suitable input arguments. When calling the tool, the model is only allowed to request one of a fixed set of sections, as defined in the sections array.

struct SectionReader: Tool {
    let article: Article
    let sections: [String]

    let name: String = "readSection"
    let description: String = "Read a specific section from the article."

    var parameters: GenerationSchema {
        GenerationSchema(
            type: GeneratedContent.self,
            properties: [
                GenerationSchema.Property(
                    name: "section",
                    description: "The article section to access.",
                    type: String.self,
                    guides: [.anyOf(sections)]
                )
            ]
        )
    }

    func call(arguments: GeneratedContent) async throws -> String {
        let requestedSectionName = try arguments.value(String.self, forProperty: "section")
        ...
    }
}

However, I have found that the model will sometimes call the tool with invalid (but plausible) section names, meaning that .anyOf is not actually doing its job (i.e. requestedSectionName is sometimes not a member of sections). The documentation for the .anyOf guide says, "Enforces that the string be one of the provided values." Is this a bug or have I made a mistake somewhere? (A small defensive-check sketch follows this entry.) Many thanks for any help you provide!
11
0
851
Jan ’26
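Regardless of whether the .anyOf behavior described above turns out to be a bug, a small guard inside call(arguments:) keeps an out-of-set section name from silently doing the wrong thing. This sketch reuses only names already defined in the post; the actual section lookup is left as a placeholder.

func call(arguments: GeneratedContent) async throws -> String {
    let requestedSectionName = try arguments.value(String.self, forProperty: "section")

    // Defensive check: if the model ignores the .anyOf guide, report the valid
    // options back to it instead of failing or reading the wrong section.
    guard sections.contains(requestedSectionName) else {
        return "Unknown section '\(requestedSectionName)'. Valid sections are: \(sections.joined(separator: ", "))."
    }

    // ... existing lookup of the requested section in `article` goes here ...
    return ""   // placeholder for the post's elided implementation
}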
Code along with the Foundation Models framework
In this online session, you can code along with us as we build generative AI features into a sample app live in Xcode. We'll guide you through implementing core features like basic text generation, as well as advanced topics like guided generation for structured data output, streaming responses for dynamic UI updates, and tool calling to retrieve data or take an action.

Check out these resources to get started:
Download the project files: https://developer.apple.com/events/re...
Explore the code along guide: https://developer.apple.com/events/re...
Join the live Q&A: https://developer.apple.com/videos/pl...

Agenda – All times PDT
10 a.m.: Welcome and Xcode setup
10:15 a.m.: Framework basics, guided generation, and building prompts
11 a.m.: Break
11:10 a.m.: UI streaming, tool calling, and performance optimization
11:50 a.m.: Wrap up

All are welcome to attend the session. To actively code along, you'll need a Mac with Apple silicon that supports Apple Intelligence running the latest release of macOS Tahoe 26 and Xcode 26. If you have questions after the code along concludes, please share a post here in the forums and engage with the community.
0
0
297
Sep ’25
Vision Framework VNTrackObjectRequest: Minimum Valid Bounding Box Size Causing Internal Error (Code=9)
I'm developing a tennis ball tracking feature using the Vision framework in Swift, specifically utilizing VNDetectedObjectObservation and VNTrackObjectRequest. Occasionally (but not always), I receive the following runtime error:

Failed to perform SequenceRequest: Error Domain=com.apple.Vision Code=9 "Internal error: unexpected tracked object bounding box size" UserInfo={NSLocalizedDescription=Internal error: unexpected tracked object bounding box size}

From my investigation, I suspect the issue arises when the bounding box from the initial observation (VNDetectedObjectObservation) is too small. However, Apple's documentation doesn't clearly define the minimum bounding box size that's considered valid by VNTrackObjectRequest. Could someone clarify:
- What is the minimum acceptable bounding box width and height (normalized) that VNTrackObjectRequest expects?
- Is there any recommended practice or official guidance for bounding box size validation before creating a tracking request? (A rough validation sketch follows this entry.)

This information would be extremely helpful to reliably avoid this internal error. Thank you!
0
0
133
Apr ’25
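A rough pre-flight check for the second question above. The actual minimum isn't documented (that's the question), so the 0.02 normalized threshold below is a made-up guess to illustrate the guard, not a value from Apple; the rect is also checked to stay inside the normalized [0, 1] space before the request is created.

import Vision

// Returns nil (skip tracking this frame) rather than risking the Code=9 error.
func makeTrackingRequest(for boundingBox: CGRect,
                         minimumNormalizedSide: CGFloat = 0.02) -> VNTrackObjectRequest? {
    // Vision bounding boxes are normalized to [0, 1] in both dimensions.
    guard boundingBox.width >= minimumNormalizedSide,
          boundingBox.height >= minimumNormalizedSide,
          CGRect(x: 0, y: 0, width: 1, height: 1).contains(boundingBox) else {
        return nil
    }
    let observation = VNDetectedObjectObservation(boundingBox: boundingBox)
    return VNTrackObjectRequest(detectedObjectObservation: observation)
}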
Subject: Technical Report: Float32 Precision Ceiling & Memory Fragmentation in JAX/Metal Workloads on M3
Subject: Technical Report: Float32 Precision Ceiling & Memory Fragmentation in JAX/Metal Workloads on M3
To: Metal Developer Relations

Hello, I am reporting a repeatable numerical saturation point encountered during sustained recursive high-order differential workloads on the Apple M3 (16 GB unified memory) using the JAX Metal backend.

Workload Characteristics:
- Large-scale vector projections across multi-dimensional industrial datasets
- Repeated high-order finite-difference calculations
- Heavy use of jax.grad and lax.cond inside long-running loops

Observation:
Under these conditions, the Metal/MPS backend consistently enters a terminal quantization lock where outputs saturate at a fixed scalar value (2.0000), followed by system-wide NaN propagation. This appears to be a precision-limited boundary in the JAX-Metal bridge when handling high-order operations with cubic time-scale denominators. I have identified the specific threshold where recursive high-order tensor derivatives exceed the numerical resolution of 32-bit consumer architectures, necessitating a migration to a dedicated 64-bit industrial stack.

I have prepared a minimal synthetic test script (randomized vectors only, no proprietary logic) that reliably reproduces the allocator fragmentation and saturation behavior. Let me know if your team would like the telemetry for XLA/MPS optimization purposes.

Best regards,
Alex Severson
Architect, QuantumPulse AI
0
0
200
2w
Core Image for depth maps & segmentation masks: numeric fidelity issues when rendering CIImage to CVPixelBuffer (looking for Architecture suggestions)
Hello All, I’m working on a computer-vision–heavy iOS application that uses the camera, LiDAR depth maps, and semantic segmentation to reason about the environment (object identification, localization and measurement - not just visualization).

Current architecture
I initially built the image pipeline around CIImage as a unifying abstraction. It seemed like a good idea because:
- CIImage integrates cleanly with Vision, ARKit, AVFoundation, Metal, Core Graphics, etc.
- It provides a rich set of out-of-the-box transforms and filters.
- It is immutable and thread-safe, which significantly simplified concurrency in a multi-queue pipeline.
The LiDAR depth maps, semantic segmentation masks, etc. were treated as CIImages, with conversion to CVPixelBuffer or MTLTexture only at the edges when required.

Problem
I’ve run into cases where Core Image transformations do not preserve numeric fidelity for non-visual data. Example: rendering a CIImage-backed segmentation mask into a larger CVPixelBuffer can cause label values to change in predictable but incorrect ways. This occurs even when:
- using nearest-neighbor sampling
- disabling color management (workingColorSpace / outputColorSpace = NSNull)
- applying identity or simple affine transforms
I’ve confirmed via controlled tests that:
- Metal → CVPixelBuffer paths preserve values correctly
- CIImage → CVPixelBuffer paths can introduce value changes when resampling or expanding the render target
This makes CIImage unsafe as a source of numeric truth for segmentation masks and depth-based logic, even though it works well for visualization, and I should have realized this much sooner.

Direction I’m considering
I’m now considering refactoring toward more intent-based abstractions instead of a single image type, for example:
- Visual images: CIImage (camera frames, overlays, debugging, UI)
- Scalar fields: depth / confidence maps backed by CVPixelBuffer + Metal
- Label maps: segmentation masks backed by integer-preserving buffers (no interpolation, no transforms)
In this model, CIImage would still be used extensively — but primarily for visualization and perceptual processing, not as the container for numerically sensitive data.

Thread safety concern
One of the original advantages of CIImage was that it is thread-safe by design, and that was my biggest incentive. For CVPixelBuffer / MTLTexture–backed data, I’m considering enforcing thread safety explicitly via Swift Concurrency (actor-owned data, explicit ownership). A rough actor-ownership sketch follows this entry.

Questions
For those who may have experience with CV / AR / imaging-heavy iOS apps, I was hoping to learn the following:
- Is this separation of image intent (visual vs numeric vs categorical) a reasonable architectural direction?
- Do you generally keep CIImage at the heart of your pipeline, or push it to the edges (visualization only)?
- How do you manage thread safety and ownership when working heavily with CVPixelBuffer and Metal? Using actor-based abstractions, GCD, or ad hoc?
- Are there any best practices or gotchas around using Core Image with depth maps or segmentation masks that I should be aware of?

I’d really appreciate any guidance or experience-based advice. I suspect I’ve hit a boundary of Core Image’s design, and I’m trying to refactor in a way that doesn't involve too much immediate tech debt and remains robust and maintainable long-term. Thank you in advance!
2
0
356
3w
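A minimal sketch of the actor-owned direction mentioned above, assuming label maps are single-plane 8-bit buffers (e.g. kCVPixelFormatType_OneComponent8). It only illustrates the ownership idea: all reads go through the actor, nothing is resampled, and the raw buffer is never handed to Core Image.

import CoreVideo

actor LabelMap {
    private let buffer: CVPixelBuffer

    init(buffer: CVPixelBuffer) {
        self.buffer = buffer
    }

    /// Reads the class label at a pixel coordinate without any resampling.
    func label(x: Int, y: Int) -> UInt8? {
        CVPixelBufferLockBaseAddress(buffer, .readOnly)
        defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

        guard x >= 0, y >= 0,
              x < CVPixelBufferGetWidth(buffer),
              y < CVPixelBufferGetHeight(buffer),
              let base = CVPixelBufferGetBaseAddress(buffer) else { return nil }

        let bytesPerRow = CVPixelBufferGetBytesPerRow(buffer)
        let pixel = base.advanced(by: y * bytesPerRow + x)
            .assumingMemoryBound(to: UInt8.self)
        return pixel.pointee
    }
}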
Context window 90% of adapter model full after single user prompt
I have been able to train an adapter on Google's Colaboratory. I am able to start a LanguageModelSession and load it with my adapter. The problem is that after one simple prompt, the context window is 90% full. If I start the session without the adapter, the same simple prompt consumes only 1% of the context window. Has anyone encountered this? I asked Claude AI and it seems to think that my training script needs adjusting. Grok on the other hand is (wrongly, I tried) convinced that I just need to tweak some parameters of LanguageModelSession or SystemLanguageModel. Thanks for any tips.
13
0
3.2k
Feb ’26
MPS Kernel and Sparse Matrix
Hello, do you have any information on the handling of sparse matrices with MPS and PyTorch? Any word on a release date? ...
0
0
493
Dec ’25
Unpredictable performance when using structured output
Hey, when generating responses with structured output and the non-streaming API, it sometimes takes 3 s and sometimes 10–20 s. I'm firing the same request repeatedly while testing the app. Is this by design, or is there anywhere I can learn more about what contributes to such variation? (A small timing sketch follows this entry.)
1
0
223
Jul ’25
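A small, hedged timing harness for the question above. TripSummary is a made-up @Generable type, and the main variable it tries to isolate is session reuse: a reused LanguageModelSession keeps its transcript, so later requests carry more context and can take longer, which may account for some of the spread.

import FoundationModels

@Generable
struct TripSummary {
    @Guide(description: "A short title for the trip")
    var title: String
    var dayCount: Int
}

// Times repeated structured-output requests, either reusing one session or
// creating a fresh one per attempt for comparison.
func measureStructuredResponses(reuseSession: Bool, attempts: Int = 5) async throws {
    let clock = ContinuousClock()
    var session = LanguageModelSession()
    for attempt in 1...attempts {
        if !reuseSession { session = LanguageModelSession() }   // fresh context each time
        let start = clock.now
        _ = try await session.respond(
            to: "Summarize a three-day trip to Kyoto.",
            generating: TripSummary.self
        )
        print("Attempt \(attempt): \(start.duration(to: clock.now))")
    }
}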
Is face and body detection a local model or a cloud model?
Is the face and body detection service in the Vision framework a local model or a cloud model? https://developer.apple.com/documentation/vision
1
0
746
Sep ’25
Supported regex patterns for generation guide
Hey, I tried using a few regular expressions and they all fail with an error: "Unhandled error streaming response: A generation guide with an unsupported pattern was used." Is there a list of supported features? I don't see it in the docs, and the API accepts a Regex. Anything with e.g. [A-Z] fails.
1
0
151
Jul ’25