
Taking a Depth Image From an iPhone (or Consumer Camera)

I have read that it's possible to create a depth image from a stereo camera setup, where two cameras with identical focal length, aperture, and other settings photograph an object from slightly different angles.

Would it be possible to take two snapshots almost immediately after each other (on the iPhone, for example) and use the differences between the two pictures to derive a depth image?

Small amounts of hand movement and shake will obviously rock the camera, creating some angular displacement, and perhaps that displacement could be estimated from the overall shift of features detected in both photographs.
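The core idea in the question, matching small patches between two slightly displaced shots to recover a per-pixel shift, can be illustrated with a toy, dependency-free sketch. Real pipelines (such as the OpenCV-based ones mentioned in the answers) do this over rectified 2-D images; everything here, including the synthetic scanlines, is illustrative only.

```python
# Toy 1-D block matching: estimate the horizontal shift (disparity) of each
# pixel between a "left" and a "right" scanline by minimising the sum of
# absolute differences (SAD) over a small window. Real stereo matchers work
# on full rectified 2-D images; this pure-Python sketch is illustrative only.

def disparity_1d(left, right, window=1, max_disp=5):
    """Per-pixel disparity of `left` relative to `right` via SAD matching."""
    n = len(left)
    disps = []
    for x in range(window, n - window):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            if x - d < window:          # window would fall off the edge
                break
            cost = sum(abs(left[x + k] - right[x - d + k])
                       for k in range(-window, window + 1))
            if cost < best_cost:
                best_cost, best_d = cost, d
        disps.append(best_d)
    return disps

# Synthetic scanlines: `right` is `left` with every feature moved 3 pixels
# to the left, as if the camera translated sideways between the snapshots.
left  = [0, 0, 0, 0, 0, 10, 40, 90, 40, 10, 0, 0, 0, 0]
right = [0, 0, 10, 40, 90, 40, 10, 0, 0, 0, 0, 0, 0, 0]

d = disparity_1d(left, right)
# Pixels over the intensity bump (x = 5..9) all report a shift of 3;
# textureless (flat) regions are ambiguous, which is a real weakness of
# block matching too.
```

Note how the flat regions fall back to disparity 0: without texture there is nothing to match, which is exactly why the "crude at best" caveat in the answers below applies to hand-held two-shot capture.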

Another way to look at this problem is as structure-from-motion, a nice review of which can be found here.

Generally speaking, resolving spatial correspondence can also be cast as a temporal correspondence problem. If the scene doesn't change, then taking two images simultaneously from different viewpoints (as in stereo) is effectively the same as taking two images with the same camera, moved between the viewpoints over time.
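To make that equivalence concrete: however the two views were obtained, once a correspondence yields a pixel disparity d, depth follows from triangulation as Z = f * B / d, where f is the focal length in pixels and B is the baseline (the distance the camera moved between shots). A minimal sketch with illustrative numbers, not real device parameters:

```python
# Depth from disparity for a rectified pair: Z = f * B / d.
# f: focal length in pixels, B: baseline in metres, d: disparity in pixels.
# All numbers below are illustrative, not taken from any real camera.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# e.g. f = 800 px, baseline = 6.5 cm, disparity = 13 px
#   -> Z = 800 * 0.065 / 13 = 4.0 m
z = depth_from_disparity(800.0, 0.065, 13.0)
```

The inverse relationship between disparity and depth also explains why a tiny, uncontrolled baseline (hand shake) gives such a noisy depth estimate: small disparities make Z extremely sensitive to matching error.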

I recently came across a nice toy example of this in practice, implemented using OpenCV. The article includes links to other, more robust implementations.

For a deeper understanding I would recommend getting hold of an actual copy of Hartley and Zisserman's book, "Multiple View Geometry in Computer Vision".

You could probably come up with a depth map from a "cha-cha" stereo image (as it's known in 3D photography circles), but it would be very crude at best.

Matching up the images is EXTREMELY CPU-intensive.

An iPhone is not a great device for doing the number-crunching. Its CPU isn't that fast, and memory bandwidth isn't great either.

Once Apple lets us use OpenCL on iOS, you could write OpenCL code, which would help somewhat.
