Calculating the intersection area between two rectangles with axes not aligned

I want to calculate the intersection over union (IoU) of two rectangles whose axes are not aligned, but where the angle between their axes is smaller than 30 degrees. An approximate value would also be acceptable.

One possible solution is to check whether the angle between the two rectangles is less than 30 degrees and then rotate them so that their axes are parallel. From there it is easy to calculate the IoU.
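A minimal sketch of that approximation, under one reading of "rotate them parallel": each rectangle is turned about its own centre until it is axis-parallel, which amounts to dropping its angle. The (cx, cy, w, h) box layout and the name aligned_iou are assumptions made for this example only.

```python
def aligned_iou(box_a, box_b):
    """Approximate IoU of two rotated boxes by ignoring their angles.

    Each box is (cx, cy, w, h); this layout is an assumption of the sketch.
    """
    # Axis-aligned extents of each box around its centre.
    ax0, ax1 = box_a[0] - box_a[2] / 2.0, box_a[0] + box_a[2] / 2.0
    ay0, ay1 = box_a[1] - box_a[3] / 2.0, box_a[1] + box_a[3] / 2.0
    bx0, bx1 = box_b[0] - box_b[2] / 2.0, box_b[0] + box_b[2] / 2.0
    by0, by1 = box_b[1] - box_b[3] / 2.0, box_b[1] + box_b[3] / 2.0

    # Overlap along each axis, clamped at zero when the boxes are apart.
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih

    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0
```

The error of this shortcut grows with the angle difference between the two rectangles, so it is only a rough stand-in for the exact value.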

Another possibility is to use Monte Carlo methods for the intersection (generate a point, then check whether it lies under some line of one rectangle and above some line of the other), but this seems expensive because I need to run this calculation a large number of times.
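For reference, a rough sketch of the Monte Carlo idea. It assumes each rectangle is given as a list of four (x, y) corners in order around the shape; the function names, the sample count and the fixed seed are choices made for this example, not part of the question.

```python
import random

def point_in_convex(p, poly):
    """True if p is inside (or on the border of) the convex polygon poly."""
    sign = 0
    n = len(poly)
    for i in range(n):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]
        # Cross product tells on which side of edge i the point lies.
        cross = (x1 - x0) * (p[1] - y0) - (y1 - y0) * (p[0] - x0)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False  # point is on different sides of two edges
    return True

def monte_carlo_iou(rec1, rec2, samples=20000, seed=0):
    """Estimate IoU by sampling points in the common bounding box."""
    rng = random.Random(seed)
    pts = list(rec1) + list(rec2)
    xmin, xmax = min(x for x, _ in pts), max(x for x, _ in pts)
    ymin, ymax = min(y for _, y in pts), max(y for _, y in pts)
    in_both = in_any = 0
    for _ in range(samples):
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        a = point_in_convex(p, rec1)
        b = point_in_convex(p, rec2)
        in_both += a and b
        in_any += a or b
    # The ratio of hit counts estimates intersection area / union area.
    return in_both / in_any if in_any else 0.0
```

The estimate only converges with the square root of the sample count, which matches the concern about cost when the calculation is repeated many times.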

I was hoping that there is something better out there; maybe a geometry library, or maybe an algorithm from the computer vision folks.

I am trying to learn grasping positions using deep neural networks. My algorithm should predict a bounding box (rectangle) for an object in an RGB image. For every image I also have the ground-truth bounding box (another rectangle). From these two rectangles I need the IoU.

Any ideas?

There is a quite effective algorithm for calculating the intersection of two convex polygons, described in O'Rourke's book "Computational Geometry in C".

C code is available at the book page (convconv).

The algorithm traverses the edges of the first polygon, checking the orientation of the second polygon's vertices in order to detect intersections. When two consecutive vertices lie on different sides of an edge, an intersection occurs (there are a lot of tricky cases). Algorithm outline is here

You can consider a number of numerical approaches, essentially "rendering" the rectangles into one or more "canvases" and traversing the pixels to gather your statistics. The canvas should be the size of the bounding box of the entire scene, which you can find by picking the minimum and maximum coordinates occurring on each axis.

1) "most CG" approach: really get a rendering library, render one rectangle in red and the other in transparent blue. Then visit each pixel: if it has a non-zero red component, it belongs to the first rectangle; if it has a non-zero blue component, it belongs to the second rectangle; and if it has both, it belongs to the intersection too. This approach is cheap to code, but it requires both writing and reading the canvas even in the rendering phase, which is slower than just writing. This might even be done on the GPU, though I am not sure whether the setup costs and getting the result back outweigh the benefit for such a simple scene.

2) another CG approach would be rendering into 2 arrays, preferably some 1-byte-per-pixel variant for the sake of speed (you may have to go back in time a bit in order to find such dedicated rendering libraries). This way the renderer only writes, into one array per rectangle, and you read from both when creating the statistics.

3) as writing and reading pixels takes time, you can take a shortcut, but it needs more coding: convex shapes can be rendered by collecting the minimum and maximum coordinates per scanline and just filling between the two. If you do it yourself, you can skip the filling part and also the read-and-check-every-pixel step at the end. Build such a min-max list for both rectangles, and then you "just" have to check their relation/order for each scanline to recognize overlaps.
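A minimal sketch of approach 3, assuming each rectangle is a list of four (x, y) corners in order around the shape. Instead of integer pixels it samples horizontal lines at their mid-height; the names span_at and scanline_iou and the default of 2000 lines are made up for this example.

```python
def span_at(poly, y):
    """Return (xmin, xmax) where the horizontal line at y crosses poly, or None."""
    xs = []
    n = len(poly)
    for i in range(n):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % n]
        if y0 == y1:
            continue  # horizontal edge: its end points belong to the neighbouring edges
        if min(y0, y1) <= y <= max(y0, y1):
            t = (y - y0) / (y1 - y0)
            xs.append(x0 + t * (x1 - x0))
    return (min(xs), max(xs)) if xs else None

def scanline_iou(rec1, rec2, lines=2000):
    """Approximate IoU from per-scanline min/max spans of two convex shapes."""
    ys = [y for _, y in list(rec1) + list(rec2)]
    ymin, ymax = min(ys), max(ys)
    dy = (ymax - ymin) / lines
    inter = len1 = len2 = 0.0
    for k in range(lines):
        y = ymin + (k + 0.5) * dy  # sample in the middle of each scanline
        a, b = span_at(rec1, y), span_at(rec2, y)
        if a:
            len1 += a[1] - a[0]
        if b:
            len2 += b[1] - b[0]
        if a and b:
            # Overlap of the two [min, max] intervals on this scanline.
            inter += max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    # The scanline height is the same for every sum, so it cancels in the ratio.
    union = len1 + len2 - inter
    return inter / union if union > 0 else 0.0
```

No pixel is ever written or read here; the accuracy depends only on the number of scanlines.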

And then there is the mathematical way (this is not really useful, see EDIT below): while it is unlikely that you would find a sane algorithm for calculating the intersection area specifically for rectangles, finding such an algorithm for triangles is more probable, and that would be enough. Both rectangles can be split into two triangles, 1A+1B and 2A+2B respectively, and then you just have to run such an algorithm 4 times: 1A-2A, 1A-2B, 1B-2A, 1B-2B. Sum the results and that is the area of your intersection.

EDIT: for the maths approach (though this also comes from graphics), I think https://en.wikipedia.org/wiki/Sutherland%E2%80%93Hodgman_algorithm can be applied here to find the intersection polygon (as both rectangles are convex polygons, AB and BA should produce the same result). The remaining task is then to calculate the area of that polygon (I think it is going to be convex, in which case that is really easy).

Since you are using Python, I think the Shapely package will meet your needs.

I ended up using the Sutherland-Hodgman algorithm, implemented as these functions:

def clip(subjectPolygon, clipPolygon):
   # Sutherland-Hodgman: clip subjectPolygon against every edge of the convex
   # clipPolygon. The inside test below expects counter-clockwise vertices.
   def inside(p):
      # True if p lies strictly to the left of the clip edge cp1 -> cp2
      return (cp2[0]-cp1[0])*(p[1]-cp1[1]) > (cp2[1]-cp1[1])*(p[0]-cp1[0])

   def computeIntersection():
      # Intersection of the line through cp1, cp2 with the line through s, e
      dc = [ cp1[0] - cp2[0], cp1[1] - cp2[1] ]
      dp = [ s[0] - e[0], s[1] - e[1] ]
      n1 = cp1[0] * cp2[1] - cp1[1] * cp2[0]
      n2 = s[0] * e[1] - s[1] * e[0]
      n3 = 1.0 / (dc[0] * dp[1] - dc[1] * dp[0])
      return [(n1*dp[0] - n2*dc[0]) * n3, (n1*dp[1] - n2*dc[1]) * n3]

   outputList = subjectPolygon
   cp1 = clipPolygon[-1]

   for clipVertex in clipPolygon:
      cp2 = clipVertex
      inputList = outputList
      outputList = []
      if not inputList:
         break  # everything was clipped away: the polygons do not overlap
      s = inputList[-1]

      for subjectVertex in inputList:
         e = subjectVertex
         if inside(e):
            if not inside(s):
               outputList.append(computeIntersection())
            outputList.append(e)
         elif inside(s):
            outputList.append(computeIntersection())
         s = e
      cp1 = cp2
   return outputList

def PolygonArea(corners):
    n = len(corners) # of corners
    area = 0.0
    for i in range(n):
        j = (i + 1) % n
        area += corners[i][0] * corners[j][1]
        area -= corners[j][0] * corners[i][1]
    area = abs(area) / 2.0
    return area

intersection = clip(rec1, rec2)
intersection_area = PolygonArea(intersection)
iou = intersection_area/(PolygonArea(rec1)+PolygonArea(rec2)-intersection_area)
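The snippet above does not show how rec1 and rec2 are built. A possible helper is sketched below, assuming each rectangle is described by its centre, size and rotation angle in degrees; the name rect_corners and this parameter layout are assumptions made for the example.

```python
import math

def rect_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated rectangle as (x, y) tuples, counter-clockwise."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    # Corners of the unrotated rectangle, relative to its centre.
    half = [(-w / 2.0, -h / 2.0), (w / 2.0, -h / 2.0),
            (w / 2.0, h / 2.0), (-w / 2.0, h / 2.0)]
    # Rotate each corner by the angle and shift it to the centre.
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]
```

For example, rec1 = rect_corners(0, 0, 4, 2, 0) and rec2 = rect_corners(1, 0, 4, 2, 20) could then be passed to clip and PolygonArea. The corner order matters for clip, because its inside test treats the left-hand side of each clip edge as the interior.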

Another, slower method (I don't know which algorithm it uses) could be:

from shapely.geometry import Polygon

p1 = Polygon(rec1)
p2 = Polygon(rec2)
inter_sec_area = p1.intersection(p2).area
iou = inter_sec_area/(p1.area + p2.area - inter_sec_area)

It is worth mentioning that in just one case with bigger polygons (not my use case) the shapely module returned an area twice as large as the first method. I did not test either method intensively.

This might help

What about using the Pythagorean theorem? Since you have two rectangles, when they intersect you will get one or more triangles, each with one angle of 90°.

Since you also know the angle between the two rectangles (20° in my example) and the coordinates of each rectangle, you can use the appropriate function (cos/sin/tan) to find the length of every edge of the triangles.

I hope this can help
