I'm prototyping an app where I use CoreML to identify an object. That gives me a bounding box for the object (which has 4 values all between 0 and 1). I'd like to use the ARDepthData I have access to thanks to having a phone with LiDAR to then measure the distance to that object.
The CVPixelBuffer from sceneView.session.currentFrame?.capturedImage has dimensions 1920 x 1440. The CVPixelBuffer from sceneView.session.currentFrame?.sceneDepth?.depthMap has dimensions 256 x 192.
How do I convert the bounding box of the VNRecognizedObjectObservation object to give me the depth data I need to estimate the distance to the object?
Converting the bounds of Vision requests can be tricky. Here is a very helpful article on the subject:
https://machinethink.net/blog/bounding-boxes/
Also, I believe new since that article was written, there are some helpful Vision functions such as VNImageRectForNormalizedRect:
let depthWidth = CVPixelBufferGetWidth(depthMap)
let depthHeight = CVPixelBufferGetHeight(depthMap)
let boundingBox = observation.boundingBox // your normalized 0.0-1.0 CGRect from Vision
let depthBounds = VNImageRectForNormalizedRect(boundingBox, depthWidth, depthHeight)
It returns image coordinates projected from a rectangle in a normalized coordinate space (which is what you have from Vision).
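Once you have the rectangle in depth-map pixel coordinates, you can read the depth values inside it directly from the pixel buffer. Here is a rough sketch of that step, under two assumptions: the depth buffer is kCVPixelFormatType_DepthFloat32 (meters), which is what sceneDepth.depthMap provides, and Vision's bottom-left origin needs a vertical flip to match the buffer's top-left row order. The function name and the median choice are mine, not from any API:

```swift
import ARKit
import Vision

// Sketch: estimate the distance to a detected object by sampling the
// LiDAR depth map inside the converted bounding box.
// `observation` is your VNRecognizedObjectObservation;
// `depthMap` is currentFrame.sceneDepth?.depthMap (Float32 meters).
func estimatedDistance(to observation: VNRecognizedObjectObservation,
                       in depthMap: CVPixelBuffer) -> Float? {
    let width = CVPixelBufferGetWidth(depthMap)    // 256
    let height = CVPixelBufferGetHeight(depthMap)  // 192
    let rect = VNImageRectForNormalizedRect(observation.boundingBox,
                                            width, height)

    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return nil }
    let bytesPerRow = CVPixelBufferGetBytesPerRow(depthMap)

    var depths: [Float] = []
    for y in Int(rect.minY)..<Int(rect.maxY) where (0..<height).contains(y) {
        // Vision's origin is bottom-left; row 0 of the buffer is the top,
        // so flip the y coordinate when indexing rows.
        let flippedY = height - 1 - y
        let row = base.advanced(by: flippedY * bytesPerRow)
                      .assumingMemoryBound(to: Float32.self)
        for x in Int(rect.minX)..<Int(rect.maxX) where (0..<width).contains(x) {
            let d = row[x]
            if d.isFinite && d > 0 { depths.append(d) }
        }
    }
    guard !depths.isEmpty else { return nil }
    // Median rather than mean: the box usually includes background pixels,
    // and the median is more robust to those outliers.
    return depths.sorted()[depths.count / 2]
}
```

Taking the median over the whole box is a simple heuristic; depending on your object you might instead sample a small patch around the box center, or take the minimum depth.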