OpenCV get 3D coordinates from 2D pixel

For my undergraduate paper I am working on an iPhone application that uses OpenCV to detect domino tiles. The detection works well at close range, but when the camera is angled, tiles that are far away are difficult to detect. My approach to solving this is to do some spatial calculations: convert a 2D pixel value into world coordinates, compute a new 3D position with a vector, convert those coordinates back to 2D, and then check the colour/shape at that position.

Additionally, I need to know the 3D positions for augmented-reality additions.

I got the camera matrix through this link: create opencv camera matrix for iPhone 5 solvepnp.

The rotation matrix of the camera I get from Core Motion.

Using ArUco markers would be my last resort, as I wouldn't get the desired effect that I need for the paper.

Now my question is: can't I make these calculations when I know the locations and distances of the circles on, let's say, a tile with a 5 on it? I wouldn't need measurements in mm/inches; I can live with vectors without absolute measurements.

The camera needs to be able to rotate freely.

I tried to invert the equation s·m' = A[R|t]M' to be able to calculate the 3D coordinates from the 2D ones. But I am stuck inverting [R|t] even on paper, and I don't know how I would do that in Swift or C++ either.

I have read so many different posts in forums, books, etc., and I am completely stuck; I appreciate any help/input you can give me. Otherwise I'm screwed.

Thank you so much for your help.

Update:

By using solvePnP, as suggested by Micka, I was able to get the rotation and translation vectors for the pose of the camera. Meaning that if you can identify multiple 2D points in your image and know their respective 3D world coordinates (in mm, cm, inches, ...), then you have the mechanism to project points from known 3D world coordinates onto the corresponding 2D coordinates in your image (use the OpenCV projectPoints function).
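
A minimal sketch of that step, assuming hypothetical pixel values and a hypothetical camera matrix (use your own calibrated intrinsics); the tile-local coordinates reuse the numbers from the homography attempt further down:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // 3D positions of four tile dots in tile-local coordinates (flat tile, Z = 0).
    std::vector<cv::Point3f> objectPoints = {
        {5.5f, 19.44f, 0.f}, {12.53f, 19.44f, 0.f},
        {19.56f, 19.44f, 0.f}, {12.53f, 12.19f, 0.f}};

    // Their detected pixel positions in the image (hypothetical values).
    std::vector<cv::Point2f> imagePoints = {
        {1024.f, 720.f}, {1101.f, 716.f}, {1178.f, 712.f}, {1099.f, 642.f}};

    // Camera intrinsics (hypothetical values) and no lens distortion assumed.
    cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) <<
        2880.0, 0.0, 1632.0,
        0.0, 2880.0, 1224.0,
        0.0, 0.0, 1.0);
    cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64F);

    // Recover the camera pose relative to the tile.
    cv::Mat rvec, tvec;
    cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);

    // Project another known 3D point (e.g. the expected bar center) into the image.
    std::vector<cv::Point3f> query = {{12.53f, 5.0f, 0.f}};
    std::vector<cv::Point2f> projected;
    cv::projectPoints(query, rvec, tvec, cameraMatrix, distCoeffs, projected);

    std::cout << "projected bar center: " << projected[0] << std::endl;
    return 0;
}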

What is up next for me to solve is the conversion from 2D back into 3D coordinates, where I need to follow ozlsn's approach with the inverse of the matrices received from solvePnP.

Update 2: With a top-down view I am getting along quite well and am able to detect the tiles and their positions in the 3D world: tile from top down

However, if I angle the view, my calculations no longer work. For example, I check the bottom edge of a 9-dot group and the center of the black division bar for 90° angles. If Corner1 -> Middle Edge -> Bar Center and Corner2 -> Middle Edge -> Bar Center are both 90° angles, then the bar in the middle is found and the position of the tile can be determined (a minimal version of that angle check is sketched below).
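
A minimal sketch of such an angle check, assuming corner1, corner2, edgeMiddle and barCenter are the detected 2D points; the helper name is only illustrative:

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>

// Angle in degrees at the vertex 'middle', formed by the segments middle->a and middle->b.
static double angleAtVertex(const cv::Point2f& middle, const cv::Point2f& a, const cv::Point2f& b)
{
    cv::Point2f v1 = a - middle;
    cv::Point2f v2 = b - middle;
    double cosAngle = v1.dot(v2) / (std::hypot(v1.x, v1.y) * std::hypot(v2.x, v2.y));
    cosAngle = std::max(-1.0, std::min(1.0, cosAngle)); // guard against rounding errors
    return std::acos(cosAngle) * 180.0 / CV_PI;
}

// Accept the bar only if both corner angles are close to 90 degrees:
// double a1 = angleAtVertex(edgeMiddle, corner1, barCenter);
// double a2 = angleAtVertex(edgeMiddle, corner2, barCenter);
// bool barFound = std::abs(a1 - 90.0) < tolerance && std::abs(a2 - 90.0) < tolerance;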

When the view is angled, these angles are shifted by the perspective to, let's say, 130° and 50°. (I'll provide an image later.)

The idea I have now is to run solvePnP on 4 points (bottom edge plus middle) and then rotate the needed dots and the center bar from their 2D positions to 3D positions (height should be irrelevant?). Then I could check with the transformed points whether the angles are 90° and also do the other needed distance calculations.

Here is an image of what I am trying to accomplish: Markings for Problem

I first find the 9 dots and arrange them. For each edge I try to find the black bar. As said above, seen from the top, the angle blue corner -> green middle edge -> yellow bar center is 90°. However, when the camera is angled, the angle is no longer 90°. I also cannot just check whether the two angles add up to 180°, as that would give me false positives. So I want to perform the following steps:

  1. Detect the center
  2. Detect the edges (3 dots)
  3. Run solvePnP with those 4 points
  4. Rotate the edge and center points (coordinates) to 3D positions
  5. Measure the angles (check whether both are 90°)

Now I wonder how I can transform the 2D coordinates of those points into 3D. I don't care about absolute distances, as I only calculate relative to other distances (like 1.4 times the middle-edge distance), etc. If I could measure the distance in mm, though, that would be even better and would give me better results.

With solvePnP I get the rvec, which I can convert into a rotation matrix (with Rodrigues(), I believe). To measure the angles, my understanding is that I don't need to apply the translation (tvec) from solvePnP.
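
A minimal sketch of that conversion, assuming rvec and tvec hold the output of solvePnP from above:

// Convert the rotation vector from solvePnP into a 3x3 rotation matrix.
cv::Mat R;
cv::Rodrigues(rvec, R);

// Optionally assemble the full 3x4 pose [R|t] for later projections.
cv::Mat Rt;
cv::hconcat(R, tvec, Rt);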

This leads to my last question: when using the iPhone, can't I use the angles from the motion detection (Core Motion) to build the rotation matrix beforehand and only use that to rotate the tile so it is shown from the top? I feel this would save me a lot of CPU time, since I wouldn't have to run solvePnP for each tile (there can be up to about 100 tiles).

Find Homography

vector<Point2f> tileDots;
tileDots.push_back(corner1);
tileDots.push_back(edgeMiddle);
tileDots.push_back(corner2);
tileDots.push_back(middle.Dot->ellipse.center);

vector<Point2f> realLivePos;
realLivePos.push_back(Point2f(5.5,19.44));
realLivePos.push_back(Point2f(12.53,19.44));
realLivePos.push_back(Point2f(19.56,19.44));
realLivePos.push_back(Point2f(12.53,12.19));

Mat M = findHomography(tileDots, realLivePos, CV_RANSAC);

cout << "M = "<< endl << " "  << M << endl << endl;

vector<Point2f> barPerspective;
barPerspective.push_back(corner1);
barPerspective.push_back(edgeMiddle);
barPerspective.push_back(corner2);
barPerspective.push_back(middle.Dot->ellipse.center);
barPerspective.push_back(possibleBar.center);
vector<Point2f> barTransformed;

if (countNonZero(M) < 1)
{
    cout << "No Homography found" << endl;
} else {
    perspectiveTransform(barPerspective, barTransformed, M);
}

This however gives me wrong values, and I don't know where to look anymore (I can't see the forest for the trees).

Image Coordinates https://i.stack.imgur.com/c67EH.png
World Coordinates https://i.stack.imgur.com/Im6M8.png
Points to Transform https://i.stack.imgur.com/hHjBM.png
Transformed Points https://i.stack.imgur.com/P6lLS.png

You see, I am even too stupid to post 4 images here?!

The 4th index item should be at x 2007, y 717. I don't know what I am doing wrong here.

Update 3: I found the following post, Computing x,y coordinate (3D) from image point, which does exactly what I need. Maybe there is a faster way to do it, but I haven't been able to find one. At the moment I can do the checks, but I still need to test whether the algorithm is now robust enough.
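
The gist of that post, as I understand it, is to back-project the pixel onto the tile plane (world Z = 0). A rough sketch, assuming cameraMatrix, rvec and tvec come from the calibration and solvePnP above:

// Back-project an image pixel onto the world plane Z = 0.
cv::Point3d backProjectToPlane(const cv::Point2d& pixel,
                               const cv::Mat& cameraMatrix,
                               const cv::Mat& rvec,
                               const cv::Mat& tvec)
{
    cv::Mat R;
    cv::Rodrigues(rvec, R);

    // Homogeneous pixel coordinates.
    cv::Mat uv = (cv::Mat_<double>(3, 1) << pixel.x, pixel.y, 1.0);

    // s * uv = A * (R * P + t)  =>  P = R^-1 * (s * A^-1 * uv - t)
    cv::Mat lhs = R.inv() * cameraMatrix.inv() * uv;
    cv::Mat rhs = R.inv() * tvec;

    // Pick the scale s so that the world Z coordinate becomes 0 (point lies on the tile).
    double s = rhs.at<double>(2, 0) / lhs.at<double>(2, 0);

    cv::Mat world = s * lhs - rhs;
    return cv::Point3d(world.at<double>(0, 0),
                       world.at<double>(1, 0),
                       world.at<double>(2, 0));
}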

Result with solvePnP to find the bar center

The matrix [R|t] is not square, so by definition you cannot invert it. However, this matrix lives in projective space, which is nothing but an extension of R^n (Euclidean space) with a '1' appended as the (n+1)-st element. For compatibility, the matrices that multiply vectors of the projective space get an extra row appended, with a '1' in the lower-right corner. That is, R becomes

[R|0]
[0|1]

In your case [R|t] becomes

[R|t]
[0|1]

and you can take its inverse, which reads as

[R'|-R't]
[0 | 1  ]

where ' denotes the transpose. The portion that you need is the top row.
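
A small OpenCV sketch of that inversion, assuming R is the 3x3 rotation matrix (e.g. from Rodrigues) and tvec the 3x1 translation from solvePnP:

// Invert the pose: the transpose of a rotation matrix is its inverse.
cv::Mat R_inv = R.t();
cv::Mat t_inv = -R_inv * tvec;   // -R' * t

// Optional: assemble the full 4x4 inverse transform described above.
cv::Mat T_inv = cv::Mat::eye(4, 4, CV_64F);
R_inv.copyTo(T_inv(cv::Rect(0, 0, 3, 3)));
t_inv.copyTo(T_inv(cv::Rect(3, 0, 1, 3)));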

Since the phone translates in 3D space, you need the distance of the pixel under consideration. This means that the answer to your question about whether you need distances in mm/inches is yes. The answer only changes if you can assume that the ratio of camera translation to depth is very small; this is called a weak perspective camera. The problem you are trying to tackle is not an easy one; people are still researching it at PhD level.
