
Algorithm to determine size of a room from video feed

Does anybody know of an image analysis algorithm with which I can determine, from one (or multiple) video recordings of a room, approximately how large that room is in real-life measurements (let's say width in meters or something)?

I'm currently using OpenCV as my image library of choice, but I haven't gotten very far in terms of learning image analysis algorithms, so just a name drop would be fine.

Thanks

Edit: Okay, a little bit of clarification I just got from the people involved. I basically have no control over how the video feed is taken, and can't guarantee that there are multiple data sources. I do, however, have the location of a certain point in the room, and I'm supposed to place something in relation to that point. So I would probably look at trying to identify the edges of the room, then work out what fraction of the way across the room the given location sits, and then guess how large the room is.
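A rough sketch of that proportional-placement idea (assuming the room edges have already been detected somehow, e.g. with OpenCV edge detection, and ignoring perspective foreshortening; the function name and numbers are made up for illustration):

```python
def fractional_position(edge_left_px, edge_right_px, point_px):
    """Fraction of the way across the room a point sits, based on the
    pixel x-coordinates of the detected left and right room edges.
    Note: this maps pixel fractions linearly to real-world fractions,
    which only holds for a roughly fronto-parallel view of the wall."""
    return (point_px - edge_left_px) / (edge_right_px - edge_left_px)

# Example: room edges detected at x = 120 and x = 520 in the frame,
# the known reference point at x = 220 -> 25% of the way across.
frac = fractional_position(120, 520, 220)  # 0.25

# If the real-world width can be guessed (say, 8 m), the point's
# offset from the left wall follows directly.
offset_m = frac * 8.0  # 2.0 m, under the assumed width
```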

Awfully difficult (yet interesting!) problem.

If you are thinking of doing this in a completely automated way, I think you'll have a lot of issues. But I think this is doable if an operator can mark control points in a set of pictures.

Your problem can be stated more generally as finding the distance between two points in 3D space, when you only have the locations of these points in two or more 2D pictures taken from different points of view. The process will work more or less like this:

  • The pictures will come with camera location and orientation information. For example, let's say that you get two pictures with the same camera orientation, taken with the camera three feet apart horizontally. You will have to define a reference origin for the 3D space in which the cameras are located; for example, you can say that the left picture has the camera at (0,0,0) and the right picture at (3,0,0), with both facing forward, which would be an orientation of (0,0,1). Or something like that.
  • Now the operator comes and marks the two corners of the room in both pictures. So you have 2 sets of 2D coordinates for each corner.
  • You must know the details of your camera and lens (field of view, lens distortion, aberrations, etc.). The more you know about how your camera deforms the image, the more accurate you can make your estimate. This is the same stuff panorama stitching software does to achieve a better stitch. See PanoTools for info on this.
  • Here comes the fun part: you will now do the inverse of a perspective projection for each of your 2D points. The perspective projection takes a point in 3D space and a camera definition and computes a 2D point. This is used to represent three-dimensional objects on a flat surface, like a computer screen. You are doing the reverse of that: for each 2D point you will try to obtain a 3D coordinate. Since there isn't enough information in a 2D point to determine depth, the best you can do from a single 2D point is obtain a line in 3D space that passes through the lens and through the point in question, but you don't know how far from the lens the point is. But you have the same 2D point in two images, so you can compute two 3D lines from different camera locations. These lines will not be parallel, so they will intersect at a single point. The intersection point of the 3D lines will be a good estimate of the location of the 3D point in space, in the coordinates of your reference camera 3D space.
  • The rest is easy. When you have the estimated 3D locations of the two points of interest, you just compute the 3D distance between them, and that's the number you want.
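The steps above can be sketched in code. This is a minimal toy version under strong assumptions: an ideal pinhole camera with unit focal length and no lens distortion, both cameras facing (0,0,1) as in the example, and made-up pixel coordinates; all function names are invented for illustration. Since rays from real, noisy measurements rarely intersect exactly, the "intersection" is taken as the midpoint of the shortest segment between the two rays (a least-squares solution):

```python
import numpy as np

def pixel_to_ray(pixel, cam_pos, focal=1.0, center=(0.0, 0.0)):
    """Back-project a 2D pixel into a 3D ray (pinhole model, camera
    facing +z, no distortion). Returns the ray origin and unit direction."""
    x = (pixel[0] - center[0]) / focal
    y = (pixel[1] - center[1]) / focal
    d = np.array([x, y, 1.0])
    return np.asarray(cam_pos, dtype=float), d / np.linalg.norm(d)

def ray_midpoint(p1, d1, p2, d2):
    """Midpoint of the shortest segment between two rays: solve the 2x2
    normal equations for the parameters t1, t2 that minimize
    |(p1 + t1*d1) - (p2 + t2*d2)|."""
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([d1 @ (p2 - p1), d2 @ (p2 - p1)])
    t1, t2 = np.linalg.solve(A, b)
    return ((p1 + t1 * d1) + (p2 + t2 * d2)) / 2.0

def triangulate(px_left, px_right, cam_left, cam_right):
    """Estimate the 3D location of a point marked in both pictures."""
    p1, d1 = pixel_to_ray(px_left, cam_left)
    p2, d2 = pixel_to_ray(px_right, cam_right)
    return ray_midpoint(p1, d1, p2, d2)

# Two cameras three units apart, as in the example above.
cam_l, cam_r = (0.0, 0.0, 0.0), (3.0, 0.0, 0.0)

# Normalized image coordinates the operator marked for each room corner
# (made-up values, consistent with corners at (-1.5, 0, 6) and (4.5, 0, 6)).
corner_a = triangulate((-0.25, 0.0), (-0.75, 0.0), cam_l, cam_r)
corner_b = triangulate((0.75, 0.0), (0.25, 0.0), cam_l, cam_r)

room_width = np.linalg.norm(corner_a - corner_b)  # -> 6.0
```

A production version would undistort the marked pixels first (this is where the camera/lens details from the third step come in) and would handle arbitrary camera orientations by rotating the ray directions into the reference frame.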

Pretty easy, huh?

