We're developing a visionOS application in which we would like to do product recognition (for example, food items).
We have enterprise entitlements and therefore main camera access on visionOS. We send the live camera frames to a trained Core ML model and receive 2D coordinates from the model's detection predictions.
Now we would like to create a 3D anchor on each detected item so it is visible to the user. The anchor's content will be the class name of the detected item.
How do we transform the 2D coordinate from the model prediction into a 3D anchor?
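A rough sketch of one way to do the 2D-to-3D conversion (an assumption, not a confirmed recipe): unproject the detection's pixel coordinate into a ray using the camera intrinsics and the camera-to-world transform delivered with each enterprise camera frame, raycast that ray against the scene-reconstruction meshes, and place a world anchor carrying the class name at the hit point. The function and parameter names below are placeholders.

import UIKit
import RealityKit

// Sketch: convert a 2D detection coordinate into a world-space label anchor.
// Assumes the usual pinhole convention: image origin at the top left, camera
// looking down -Z with +Y up, and that the scene meshes have CollisionComponents
// so the raycast can hit them.
func labelAnchor(forDetectionAt pixel: CGPoint,
                 className: String,
                 intrinsics: simd_float3x3,     // from the camera frame's parameters
                 cameraToWorld: simd_float4x4,  // camera pose in world (origin) space
                 scene: RealityKit.Scene) -> AnchorEntity? {
    // Pinhole unprojection of the pixel into a camera-space direction.
    let fx = intrinsics.columns.0.x
    let fy = intrinsics.columns.1.y
    let cx = intrinsics.columns.2.x
    let cy = intrinsics.columns.2.y
    let directionInCamera = SIMD3<Float>((Float(pixel.x) - cx) / fx,
                                         -(Float(pixel.y) - cy) / fy,
                                         -1)

    // Move the ray into world space.
    let origin = SIMD3<Float>(cameraToWorld.columns.3.x,
                              cameraToWorld.columns.3.y,
                              cameraToWorld.columns.3.z)
    let worldDirection4 = cameraToWorld * SIMD4<Float>(directionInCamera, 0)
    let direction = normalize(SIMD3<Float>(worldDirection4.x, worldDirection4.y, worldDirection4.z))

    // Intersect with the reconstructed scene mesh to recover a depth for the 2D detection.
    guard let hit = scene.raycast(origin: origin,
                                  direction: direction,
                                  length: 5,
                                  query: .nearest).first else { return nil }

    // Anchor the class name at the hit position.
    let textMesh = MeshResource.generateText(className,
                                             extrusionDepth: 0.002,
                                             font: .systemFont(ofSize: 0.05))
    let label = ModelEntity(mesh: textMesh, materials: [UnlitMaterial(color: .white)])
    let anchor = AnchorEntity(world: hit.position)
    anchor.addChild(label)
    return anchor
}

The Scene can be obtained from any entity already added to the RealityView (for example rootEntity.scene), and the returned anchor is then added with content.add(_:). Whether the scene-reconstruction meshes are hittable depends on how they are brought into RealityKit; they need collision shapes for the raycast to work.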
I want to implement the functionality shown in this video. How should I set up the window?
My coworkers and I are guessing at what data defines an anchor. I tried searching but struggled to find anything helpful.
Our best guess was a combination of Triangular Irregular Networks (TIN), GPS, magnetic compass direction, and maybe elevation sensors.
Is this documented anywhere? If not, can a definition or description be provided?
I have been concentrating on visionOS development. While I am quite familiar with RealityKit, Compositor Services has also captured my attention, but I have not yet learned it. Could you please clarify whether it is essential for me to learn Compositor Services? I would also appreciate insights into the respective advantages of RealityKit and Compositor Services.
Hello,
I'm unable to activate a timeline in my application through an OnTap, OnAddedToScene, or OnNotification trigger.
In RCP I can test and play the timelines easily.
When running in the simulator or on device the timelines simply do not run, regardless of the method through which I try to call the API.
I have two questions:
How can I check that my timelines are in the RCP project that's loaded into the scene? I don't see timelines in the entity hierarchy when I debug in the RealityKit Debugger.
Is Behaviors a component I can manually set at runtime? I can very clearly see the behaviors component attached to my entity in RCP, but when running this code:
.gesture(
    TapGesture()
        .targetedToAnyEntity()
        .onEnded { value in
            if value.entity.applyTapForBehaviors() {
                print("Success!")
            } else {
                print("Failure.")
            }
        }
)
It prints "Failure." every time, indicating to me that the entity does not have a Behavior attached to it (whether that is as a component or however else the Behavior is associated with the entity).
I also have not had success using the Notification system, or even the OnAddedToScene behavior trigger, which should theoretically work if a behavior is attached to the entity (which the tap experiment indicates it is not).
For context, this is my notification trigger code:
private let notificationTrigger = NotificationCenter.default
    .publisher(for: Notification.Name("RealityKit.NotificationTrigger"))

@Environment(\.realityKitScene) var scene

Attachment(id: "home") {
    Button {
        NotificationCenter.default.post(
            name: NSNotification.Name("RealityKit.NotificationTrigger"),
            object: nil,
            userInfo: [
                "RealityKit.NotificationTrigger.Scene": scene,
                "RealityKit.NotificationTrigger.Identifier": "test"
            ]
        )
    } label: {
        Text("Test")
    }
    .padding(20)
    .glassBackgroundEffect()
}
.onReceive(notificationTrigger) { _ in
    print("test notification received")
}
I am receiving "test notification received" print statements as well.
I'm using Xcode 16.0 with visionOS 2.0 on macOS 15.3.1.
Hi All,
We're a studio building an app, and as part of a scene we have a 3D asset with a smoke particle emitter and a curved mesh that plays video. I notice that when the video alone is played, or the particle effect alone is running, the scene works fine, but the frame rate drops drastically when both are turned on.
How do I solve this? It's an important storytelling feature.
I want to open Control Center in the Vision Pro simulator in Xcode. Is this possible? If so, please tell me how to do it. Thank you.
Hi there,
I'm developing a visionOS app that uses the anchor points and mesh from SceneReconstructionProvider anchor updates. I load an ImmersiveSpace using a RealityView, apply a ShaderGraphMaterial (from a Shader Graph in Reality Composer Pro) to the mesh, and use calls to setParameter to dynamically update the material at a very high frequency. The mesh is locked (no more updates) before the calls to setParameter. This process works for a few minutes, but then I eventually get the following error in the console:
assertion failure: Index out of range (operator[]:line 789) index = 13662, max = 1
With the following stack trace:
Thread 1 Queue : com.apple.main-thread (serial)
#0 0x00000002880f90d0 in __abort_with_payload ()
#1 0x000000028812a6dc in abort_with_payload_wrapper_internal ()
#2 0x000000028812a710 in abort_with_payload ()
#3 0x0000000288003f40 in _os_crash_msg ()
#4 0x00000001dc9ff624 in re::ecs2::ComponentBucketsBase::addComponent ()
#5 0x00000001dc9ffadc in re::ecs2::ComponentBucketsBase::moveComponent ()
#6 0x00000001dc8b0278 in re::ecs2::MaterialParameterBlockArrayComponentStateImpl::processPreparingComponents ()
#7 0x00000001dc8b05e4 in re::ecs2::MaterialParameterBlockArraySystem::update ()
#8 0x00000001dd008744 in re::Scheduler::executePhase ()
#9 0x00000001dc032ec4 in re::Engine::executePhase ()
#10 0x0000000248121898 in RCPSharedSimulationExecuteUpdate ()
#11 0x00000002264e488c in __59-[MRUISharedSimulation _doJoinWithConnectionContext:error:]_block_invoke.44 ()
#12 0x0000000268c5fe9c in _UIUpdateSequenceRunNext ()
#13 0x00000002696ea540 in schedulerStepScheduledMainSectionContinue ()
#14 0x000000026af8d284 in UC::DriverCore::continueProcessing ()
#15 0x00000001a1bd4e6c in CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION ()
#16 0x00000001a1bd4db0 in __CFRunLoopDoSource0 ()
#17 0x00000001a1bd44f0 in __CFRunLoopDoSources0 ()
#18 0x00000001a1bd3640 in __CFRunLoopRun ()
#19 0x00000001a1bce284 in _CFRunLoopRunSpecificWithOptions ()
#20 0x00000001eff12d2c in GSEventRunModal ()
#21 0x00000002697de878 in -[UIApplication _run] ()
#22 0x00000002697e33c0 in UIApplicationMain ()
#23 0x00000001b56651e4 in closure #1 (Swift.UnsafeMutablePointer<Swift.Optional<Swift.UnsafeMutablePointer<Swift.Int8>>>) -> Swift.Never in SwiftUI.KitRendererCommon(Swift.AnyObject.Type) -> Swift.Never ()
#24 0x00000001b5664f08 in SwiftUI.runApp<τ_0_0 where τ_0_0: SwiftUI.App>(τ_0_0) -> Swift.Never ()
#25 0x00000001b53ad570 in static SwiftUI.App.main() -> () ()
#26 0x0000000101bc7b9c in static MetalRendererApp.$main() ()
#27 0x0000000101bc7bdc in main ()
#28 0x0000000197fd0284 in start ()
Any advice on how to solve this or prevent the error?
Thanks!
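Not a confirmed fix, but one mitigation sketch: since the crash surfaces inside RealityKit's material-parameter update system, coalescing the rapid setParameter calls so the material is mutated at most once per engine update might relieve the pressure. The component, system, and parameter names ("Progress") below are placeholders, and both types would need to be registered at app start with registerComponent() and registerSystem().

import RealityKit

// Holds the most recent value produced by the rapid data source.
struct PendingShaderValue: Component {
    var value: Float
    var isDirty: Bool = false
}

// Applies at most one material mutation per engine update.
struct ShaderParameterSystem: System {
    static let query = EntityQuery(where: .has(PendingShaderValue.self))

    init(scene: RealityKit.Scene) {}

    func update(context: SceneUpdateContext) {
        for entity in context.entities(matching: Self.query, updatingSystemWhen: .rendering) {
            guard var pending = entity.components[PendingShaderValue.self],
                  pending.isDirty,
                  var model = entity.components[ModelComponent.self],
                  var material = model.materials.first as? ShaderGraphMaterial
            else { continue }

            // Apply only the latest value, then reassign the material so the change takes effect.
            try? material.setParameter(name: "Progress", value: .float(pending.value))
            model.materials = [material]
            entity.components.set(model)

            pending.isDirty = false
            entity.components.set(pending)
        }
    }
}

The fast producer then just writes entity.components.set(PendingShaderValue(value: latest, isDirty: true)) instead of touching the material directly.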
I tried to show a spatial photo in my application using SwiftUI's Image, but it just shows a flat version of it, even on Vision Pro.
How can I show spatial photos to users?
Are there any options for this?
Hey there,
since SceneView has been marked as "deprecated" for SwiftUI, I'm wondering which alternatives should be considered for the following situation:
I have a SwiftUI app (for iOS and iPadOS) where users can view 3D models (USDZ) in a scene, with rotate, scale, and move gestures. The models are downloaded from a web backend and loaded via local file URLs.
What I tested:
I've tried ARView in .nonAR mode and RealityView; however, I didn't get the expected result, namely that the user can rotate and scale the 3D models in a virtual space.
ARView in .nonAR mode still shows the object as in normal AR mode, without the camera stream.
I tried adding gestures to the RealityView on iOS; loading USDZ 3D models worked, but the gestures didn't.
Model3D is only available on visionOS (it would be amazing to have it on iOS).
I also checked the QuickLook preview, but it works rather awkwardly via the file picker etc., which is not how users should load 3D models in my app.
Maybe I missed something, but I couldn't find anything to help me. I'm pretty much stuck adopting the latest and greatest frameworks/APIs in my app and taking the next steps toward porting it to visionOS.
Long story short 😃:
Does someone have an idea what is the alternative to SceneView for USDZ 3D models?
I appreciate your support!!
Thanks in advance!
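One direction that might fit the constraints above (a sketch under the assumption that iOS 18's RealityView with a virtual camera is acceptable, not a drop-in answer): load the USDZ from its local file URL into a RealityView and drive the entity's orientation from a plain DragGesture. URL handling, scaling, and lighting are left out, and the drag-to-radians factor is arbitrary.

import SwiftUI
import RealityKit

struct ModelViewer: View {
    let modelURL: URL                      // local file URL of the downloaded USDZ
    @State private var model: Entity?
    @State private var baseAngle: Float = 0
    @State private var dragAngle: Float = 0

    var body: some View {
        RealityView { content in
            content.camera = .virtual       // assumption: non-AR rendering, no camera feed
            if let entity = try? await Entity(contentsOf: modelURL) {
                entity.position = [0, 0, -0.5]   // place in front of the default camera
                content.add(entity)
                model = entity
            }
        } update: { _ in
            model?.orientation = simd_quatf(angle: baseAngle + dragAngle, axis: [0, 1, 0])
        }
        .gesture(
            DragGesture()
                .onChanged { value in
                    // Map horizontal drag distance to a rotation around Y (arbitrary factor).
                    dragAngle = Float(value.translation.width) * 0.01
                }
                .onEnded { _ in
                    baseAngle += dragAngle
                    dragAngle = 0
                }
        )
    }
}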
I found some snapshot APIs in the developer documentation, like the following:
RealityKit / Views and attachments / ARView / /snapshot(saveToHDR:completion:)
SceneKit / SCNView / snapshot()
Is there a similar API in visionOS? If not, how can I implement a snapshot for RealityView and USDZ content?
SpatialEventGesture Not Working to Show Hidden Menu in Immersive Panorama View - visionOS
Problem Description
I'm developing a Vision Pro app that displays 360° panoramic photos in a full immersive space. I have a floating menu that auto-hides after 5 seconds, and I want users to be able to show the menu again using spatial gestures (particularly pinch gestures) when it's hidden.
However, the SpatialEventGesture implementation is not working as expected. The menu doesn't appear when users perform pinch gestures or other spatial interactions in the immersive space.
Current Implementation
Here's the relevant gesture detection code in my ImmersiveView:
import SwiftUI
import RealityKit

struct ImmersiveView: View {
    @EnvironmentObject var appModel: AppModel
    @Environment(\.openWindow) private var openWindow

    var body: some View {
        RealityView { content in
            // RealityView content setup with panoramic sphere...
            let rootEntity = Entity()
            content.add(rootEntity)
            // Load panoramic content here...
        }
        // Using SpatialEventGesture to handle multiple spatial gestures
        .gesture(
            SpatialEventGesture()
                .onEnded { eventCollection in
                    // Check menu visibility state
                    if !appModel.isPanoramaMenuVisible {
                        // Iterate through event collection to handle various gestures
                        for event in eventCollection {
                            switch event.kind {
                            case .touch:
                                print("Detected spatial touch gesture, showing menu")
                                showMenuWithGesture()
                                return
                            case .indirectPinch:
                                print("Detected spatial pinch gesture, showing menu")
                                showMenuWithGesture()
                                return
                            case .pointer:
                                print("Detected spatial pointer gesture, showing menu")
                                showMenuWithGesture()
                                return
                            @unknown default:
                                print("Detected unknown spatial gesture: \(event.kind)")
                                showMenuWithGesture()
                                return
                            }
                        }
                    }
                }
        )
        // Keep long press gesture as backup
        .simultaneousGesture(
            LongPressGesture(minimumDuration: 1.5)
                .onEnded { _ in
                    if !appModel.isPanoramaMenuVisible {
                        print("Detected long press gesture, showing menu")
                        showMenuWithGesture()
                    }
                }
        )
    }

    private func showMenuWithGesture() {
        if !appModel.isPanoramaMenuVisible {
            appModel.showPanoramaMenu()
            if !appModel.windowExists(id: "PanoramaMenu") {
                openWindow(id: "PanoramaMenu", value: "menu")
            }
        }
    }
}
What I've Tried
Multiple SpatialTapGesture approaches: Originally tried using multiple .gesture() modifiers with SpatialTapGesture(count: 1) and SpatialTapGesture(count: 2), but realized they override each other.
SpatialEventGesture implementation: Switched to SpatialEventGesture to handle multiple event types (.touch, .indirectPinch, .pointer), but pinch gestures still don't trigger the menu.
Added debugging: Console logs show that the gesture callbacks are never called when performing pinch gestures in the immersive space.
Backup LongPressGesture: Added a simultaneous long press gesture as backup, which also doesn't work consistently.
Expected Behavior
When the panorama menu is hidden (after 5-second auto-hide), users should be able to:
Perform a pinch gesture (indirect pinch) to show the menu
Tap in space to show the menu
Use other spatial gestures to show the menu
Questions
Is SpatialEventGesture the correct approach for detecting gestures in a full immersive RealityView?
Are there any special considerations for gesture detection when the RealityView contains a large panoramic sphere that might be intercepting gestures?
Should I be using a different gesture approach for visionOS immersive spaces?
Is there a way to ensure gestures work even when the RealityView content (panoramic sphere) might be blocking them?
Environment
Xcode 16.1
visionOS 2.5
Testing on Vision Pro device
App uses SwiftUI + RealityKit
Any guidance on the proper way to implement spatial gesture detection in visionOS immersive spaces would be greatly appreciated!
Additional Context
The app manages multiple windows and the gesture detection should work specifically when in the immersive panorama mode with the menu hidden.
Thank you for any help or suggestions!
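One thing worth checking (an assumption, not a confirmed diagnosis): gestures attached to a RealityView only fire when they hit an entity that carries both an InputTargetComponent and a CollisionComponent, so if the panorama sphere has neither, none of the gesture callbacks will run. A minimal sketch; the 10 m radius is a placeholder, and whether hit testing registers from inside the sphere may still need verification (a large invisible collision shape placed around the user is an alternative).

import RealityKit

// Sketch: give the panorama sphere something for spatial gestures to hit.
func makePanoramaSphereInteractive(_ sphere: Entity) {
    sphere.components.set(InputTargetComponent())
    sphere.components.set(CollisionComponent(
        shapes: [.generateSphere(radius: 10)],   // placeholder radius matching the panorama sphere
        mode: .trigger                           // hit-testing only, no physics response
    ))
}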
I'm a novice in RealityKit and ARKit. I'm using ARKit in SwiftUI to show a cube with a number as shown below.
import SwiftUI
import RealityKit
import ARKit

struct ContentView: View {
    var body: some View {
        return ARViewContainer()
    }
}

#Preview {
    ContentView()
}

struct ARViewContainer: UIViewRepresentable {
    typealias UIViewType = ARView

    func makeUIView(context: UIViewRepresentableContext<ARViewContainer>) -> ARView {
        let arView = ARView(frame: .zero, cameraMode: .ar, automaticallyConfigureSession: true)
        arView.enableTapGesture()
        return arView
    }

    func updateUIView(_ uiView: ARView, context: UIViewRepresentableContext<ARViewContainer>) {
    }
}

extension ARView {
    func enableTapGesture() {
        let tapGestureRecognizer = UITapGestureRecognizer(target: self, action: #selector(handleTap(recognizer:)))
        self.addGestureRecognizer(tapGestureRecognizer)
    }

    @objc func handleTap(recognizer: UITapGestureRecognizer) {
        let tapLocation = recognizer.location(in: self) // print("Tap location: \(tapLocation)")
        guard let rayResult = self.ray(through: tapLocation) else { return }
        let results = self.raycast(from: tapLocation, allowing: .estimatedPlane, alignment: .any)
        if let firstResult = results.first {
            let position = simd_make_float3(firstResult.worldTransform.columns.3)
            placeObject(at: position)
        }
    }

    func placeObject(at position: SIMD3<Float>) {
        let mesh = MeshResource.generateBox(size: 0.3)
        let material = SimpleMaterial(color: UIColor.systemRed, roughness: 0.3, isMetallic: true)
        let modelEntity = ModelEntity(mesh: mesh, materials: [material])
        var unlitMaterial = UnlitMaterial()
        if let textureResource = generateTextResource(text: "1", textColor: UIColor.white) {
            unlitMaterial.color = .init(tint: .white, texture: .init(textureResource))
            modelEntity.model?.materials = [unlitMaterial]
            let id = UUID().uuidString
            modelEntity.name = id
            modelEntity.transform.scale = [0.3, 0.1, 0.3]
            modelEntity.generateCollisionShapes(recursive: true)
            let anchorEntity = AnchorEntity(world: position)
            anchorEntity.addChild(modelEntity)
            self.scene.addAnchor(anchorEntity)
        }
    }

    func generateTextResource(text: String, textColor: UIColor) -> TextureResource? {
        if let image = text.image(withAttributes: [NSAttributedString.Key.foregroundColor: textColor], size: CGSize(width: 18, height: 18)), let cgImage = image.cgImage {
            let textureResource = try? TextureResource(image: cgImage, options: TextureResource.CreateOptions.init(semantic: nil))
            return textureResource
        }
        return nil
    }
}
I tap the floor and get a cube with '1' as shown below.
The background color of the cube is black, I guess. Where does this color come from, and how can I change it to, say, red? Thanks.
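One guess (the String.image(withAttributes:size:) helper isn't shown above, so this is an assumption): the glyph is drawn onto a bitmap with no background fill, and the undefined pixels end up black once the image is used as the unlit material's texture. One way to control the background is to render the string onto an explicitly filled bitmap; the sizes and font below are placeholders.

import UIKit

// Sketch: render a string onto a solid background color so the cube's texture
// isn't black where no glyph was drawn.
func textImage(_ text: String,
               textColor: UIColor,
               backgroundColor: UIColor,
               size: CGSize = CGSize(width: 128, height: 128)) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: size)
    return renderer.image { context in
        // Fill the whole bitmap first; this becomes the cube's face color.
        backgroundColor.setFill()
        context.fill(CGRect(origin: .zero, size: size))

        let attributes: [NSAttributedString.Key: Any] = [
            .font: UIFont.systemFont(ofSize: size.height * 0.8),
            .foregroundColor: textColor
        ]
        let textSize = (text as NSString).size(withAttributes: attributes)
        let origin = CGPoint(x: (size.width - textSize.width) / 2,
                             y: (size.height - textSize.height) / 2)
        (text as NSString).draw(at: origin, withAttributes: attributes)
    }
}

The resulting image's cgImage can then be fed into TextureResource exactly as in generateTextResource above.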
Reality Composer is no longer available in the Xcode 15 release. Is this intended, or will it be available in later releases? Do I need to revert to Xcode 15 beta 8 to get Reality Composer?
Hello everyone
I would like to create my own spatial video on my Apple Vision Pro. According to Apple's documentation, this requires two camera angles that enhance the spatial perception. I have purchased the Enterprise license with main camera access for this purpose. However, this only gives me access to the left main camera of the headset. Is there a way to access the right camera as well? Or is the single camera image enough to create a spatial video, for example by splitting the image?
I am open to any help and ideas. My goal is to create the video with the cameras on the headset, not externally.
I'm playing about with the hand-tracking systems in RealityKit on Vision Pro.
I thought it would be interesting if I could attach a virtual object to a hand when the hand is gripping (I thought it would be fun to attach a basic cylinder to mimic a wand from Harry Potter).
I'm able to detect when the user is gripping, but I'm having trouble placing an object as though it's within the hand.
The simplest version of this uses an AnchorEntity pointing to the user's palm, which kind of works, but it quickly breaks the illusion when you rotate the wrist or hand.
It seems as though I will have to roll my own anchor entity using the various points of the user's hand. I thought calculating some median point between the thumb and little finger tips would be a good start, but it's proven a little difficult, since we need both rotation and position.
I'm already out of my depth with RealityKit and matrices, and (thanks to ChatGPT) I have some code, but as soon as I apply the position manually (as opposed to using a hand anchor entity) it fails to render on the user's hand.
It feels like this should already be something someone has looked into. Any ideas on what might be the issue here?
Note: HandTrackingSystem.handTracking is a HandTrackingProvider()
guard let anchors = HandTrackingSystem.handTracking.latestAnchors.leftHand else {
    return
}

if let thumb = anchors.handSkeleton?.joint(.thumbTip),
   let little = anchors.handSkeleton?.joint(.littleFingerTip) {
    let thumbPos = simd_make_float3(thumb.anchorFromJointTransform.columns.3)
    let littlePos = simd_make_float3(little.anchorFromJointTransform.columns.3)

    let midPos = (thumbPos + littlePos) / 2
    let direction = normalize(littlePos - thumbPos)
    let rotation = simd_quatf(from: [0, 1, 0], to: direction)

    wandEntity.transform.translation = midPos
    wandEntity.transform.rotation = rotation
    content.add(wandEntity)
}
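One possible culprit (an assumption based on the snippet above): anchorFromJointTransform is expressed in the hand anchor's coordinate space, not world space, so the midpoint needs to go through the anchor's originFromAnchorTransform before being assigned to an entity that lives at the scene root. A sketch along those lines; the function name and the wand's local axis are placeholders.

import ARKit
import RealityKit

// Sketch: place the wand at the thumb/little-finger midpoint in *world* space.
func updateWand(_ wand: Entity, from hand: HandAnchor) {
    guard let skeleton = hand.handSkeleton else { return }

    // Joint positions in the hand anchor's coordinate space.
    let thumbLocal = skeleton.joint(.thumbTip).anchorFromJointTransform.columns.3
    let littleLocal = skeleton.joint(.littleFingerTip).anchorFromJointTransform.columns.3

    // Convert both joints into world space through the anchor's transform.
    let originFromAnchor = hand.originFromAnchorTransform
    let thumbWorld = simd_make_float3(originFromAnchor * thumbLocal)
    let littleWorld = simd_make_float3(originFromAnchor * littleLocal)

    let midpoint = (thumbWorld + littleWorld) / 2
    let direction = normalize(littleWorld - thumbWorld)

    // Orient the wand's local +Y axis along the thumb-to-little-finger direction.
    wand.transform.translation = midpoint
    wand.transform.rotation = simd_quatf(from: [0, 1, 0], to: direction)
}

This assumes the wand entity is parented directly to the scene root (world coordinates); if it sits under another entity, the transform would need a further conversion into that parent's space.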
Hi,
I'm developing an app for the Apple Vision Pro.
Inside the app the user should be able to load objects from the web into the scene and then be able to move them around (dragging and rotating) via gestures.
My question:
I'm working with RealityKit and use RealityView.
I have no issues loading one object and making it interactive by adding gestures to the entire RealityView via the .gesture() modifier.
Also I succeeded in loading multiple objects into the scene.
My problem is that I can't figure out how to add my gestures to multiple objects independently.
I can't use Reality Composer, since I'm loading the objects dynamically into the scene.
Using .gesture() doesn't work for multiple objects, since every gesture needs to be targeted to a specific entity, but I have multiple entities.
I also tried defining a GestureComponent and adding it to every newly loaded entity, but that doesn't seem to work; nothing happens,
even though my gesture is targeted to every entity that has my GestureComponent.
I only found solutions that are not usable on visionOS / RealityView, like installGestures.
Also I tried following this guide: https://developer.apple.com/documentation/realitykit/transforming-realitykit-entities-with-gestures
But I feel like there are things missing and contradictory: an extension of RealityView is mentioned but not shown, and the guide states that we don't need to store the translation values for every entity, yet it creates an EntityGestureState.swift file to store those values, then never uses it and instead uses a GestureStateComponent that was never mentioned, which contradicts what was just said about not storing values per entity. Either way, following the guide didn't lead me to a working solution.
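For reference, the pattern that is usually expected to work here (a sketch, with placeholder names): a single gesture with .targetedToAnyEntity() serves every entity, because value.entity identifies the one that was actually grabbed; each dynamically loaded entity just needs an InputTargetComponent and collision shapes so the gesture can target it.

import SwiftUI
import RealityKit

struct ModelsView: View {
    var body: some View {
        RealityView { content in
            // After loading each entity from the web:
            //   entity.components.set(InputTargetComponent())
            //   entity.generateCollisionShapes(recursive: true)
            //   content.add(entity)
        }
        .gesture(
            DragGesture()
                .targetedToAnyEntity()
                .onChanged { value in
                    guard let parent = value.entity.parent else { return }
                    // Convert the gesture's 3D location into the grabbed entity's parent
                    // space and move only that entity. (A production version would offset
                    // by the initial grab point.)
                    value.entity.position = value.convert(value.location3D,
                                                          from: .local,
                                                          to: parent)
                }
        )
    }
}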
I see no way to scale an entity with a hover effect.
The closest I can find is by using HoverEffectComponent with a shader hover effect. Maybe I can change the scale with a ShaderGraph, but I cannot figure out how.
I am developing a visionOS app. I am now very interested in Metal and Compositor Services, but I have not explored them in depth. I know that Metal offers a higher degree of control. I am wondering whether using Compositor Services gives me fewer AR capabilities than RealityKit (such as scene reconstruction and understanding, hover effects, etc.).
My app for framing and arranging pictures from Photos on visionOS allows users to write the arrangements they create to .reality files using RealityKit's entity.write(to:), which they then display to customers on their websites. This works perfectly on visionOS 2, but it fails with a fatal protection error on visionOS 26 beta 1 and beta 2 when write(to:) attempts to write to its internal cache:
2025-06-29 14:03:04.688 Failed to write reality file Error Domain=RERealityFileWriterErrorDomain Code=10 "Could not create parent folders for file path /var/mobile/Containers/Data/Application/81E1DDC4-331F-425D-919B-3AB87390479A/Library/Caches/com.GeorgePurvis.Photography.FrameItVision/RealityFileBundleZippingTmp_A049685F-C9B2-479B-890D-CF43D13B60E9/41453BC9-26CB-46C5-ADBE-C0A50253EC27."
UserInfo={NSLocalizedDescription=Could not create parent folders for file path /var/mobile/Containers/Data/Application/81E1DDC4-331F-425D-919B-3AB87390479A/Library/Caches/com.GeorgePurvis.Photography.FrameItVision/RealityFileBundleZippingTmp_A049685F-C9B2-479B-890D-CF43D13B60E9/41453BC9-26CB-46C5-ADBE-C0A50253EC27.}
Has anyone else encountered this problem? Do you have a workaround? Have you filed a feedback?
ChatGPT analysis of the error and my code reports:
Why there is no workaround
• entity.write(to:) is a black box — you cannot override where it builds its staging bundle
• it always tries to create those random folders itself
• you cannot supply a parent or working directory to RealityFileWriter
• so if the system fails to create that folder, you cannot patch it
👉 This is why you see a fatal error with no recovery.
See also feedbacks: FB18494954, FB18036627, FB18063766