
Efficient way to find overlapping of N rectangles

I am trying to find an efficient way to detect overlaps among n rectangles that are stored in two separate lists. We are looking for all rectangles in listA that overlap with rectangles in listB (and vice versa). Comparing every element of the first list with every element of the second could take an immense amount of time, so I am looking for something more efficient.

I have two lists of rectangles:

rect = Rectangle(10, 12, 56, 15)
rect2 = Rectangle(0, 0, 1, 15)
rect3 = Rectangle(10, 12, 56, 15)

listA = [rect, rect2]
listB = [rect3]

which are created from the class:

import numpy as np
import itertools as it

class Rectangle(object):
    # Argument order: left, bottom, right, top.
    def __init__(self, left, bottom, right, top):
        self.left = left
        self.bottom = bottom
        self.right = right
        self.top = top

    def overlap(r1, r2):
        hoverlaps = True
        voverlaps = True
        if (r1.left > r2.right) or (r1.right < r2.left):
            hoverlaps = False
        if (r1.top < r2.bottom) or (r1.bottom > r2.top):
            voverlaps = False
        return hoverlaps and voverlaps

I need to compare each rectangle in listA with each rectangle in listB. The code goes like this, which is highly inefficient because it compares them one by one:

for a, b in it.product(listA, listB):
    if a.overlap(b):
        pass  # a and b overlap; handle the pair here

Is there a more efficient method to deal with this problem?

First off: as with many problems from computational geometry, specifying the parameters for order-of-growth analysis needs care. Calling the lengths of the lists m and n, the worst case in just those parameters is Ω(m×n), as all areas might overlap (in this regard, the algorithm from the question is asymptotically optimal). It is usual to include the size of the output, o: t = f(m, n, o) (output-sensitive algorithm).
Trivially, f ∈ Ω(m+n+o) for the problem presented.


Line Sweep is a paradigm to reduce geometrical problems by one dimension - in its original form, from 2D to 1D, plane to line.

Imagine all the rectangles in the plane, different colours for the lists.
Now sweep a line across this plane - left to right, conventionally, and infinitesimally further to the right "for low y-coordinates" (handle coordinates in increasing x-order, increasing y-order for equal x).
For all of this sweep (or scan), per colour keep one set representing the "y-intervals" of all rectangles at the current x-coordinate, starting empty. (In a data structure supporting insertion, deletion, and enumerating all intervals that overlap a query interval: see below.)
Meeting the left side of a rectangle, add the segment to the data structure for its colour. Report overlapping intervals/rectangles in any other colour.
At a right side, remove the segment.
Depending on the definition of "overlapping", handle left sides before right sides - or the other way round.


There are many data structures supporting insertion and deletion of intervals, and finding all intervals that overlap a query interval. Currently, I think Augmented Search-Trees may be easiest to understand, implement, test, analyse…
Using this, enumerating all o intersecting pairs of axis-aligned rectangles (a, b) from listA and listB should be possible in O((m+n)log(m+n)+o) time and O(m+n) space. For sizeable problem instances, avoid data structures needing more than linear space ((original) Segment Trees, for one example pertaining to interval overlap).


Another paradigm in algorithm design is Divide & Conquer: with a computational geometry problem, choose one dimension in which the problem can be divided into independent parts, and a coordinate such that the sub-problems for "coordinates below" and "coordinates above" are close in expected run-time. Quite possibly, another (and different) sub-problem "including the coordinate" needs to be solved. This tends to be beneficial when a) the run-time for solving sub-problems is "super-log-linear", and b) there is a cheap (linear) way to construct the overall solution from the solutions for the sub-problems.
This lends itself to concurrent problem solving, and can be used with any other approach for sub-problems, including line sweep.


There will be many ways to tweak each approach, starting with disregarding input items that can't possibly contribute to the output. To "fairly" compare implementations of algorithms of like order of growth, don't aim for a fair "level of tweakedness": try to invest fair amounts of time for tweaking.

A couple of potential minor efficiency improvements. First, fix your overlap() function; it potentially does calculations it needn't:

def overlap(r1, r2):

    if r1.left > r2.right or r1.right < r2.left:
        return False

    if r1.top < r2.bottom or r1.bottom > r2.top:
        return False

    return True

Second, calculate the containing rectangle for one of the lists and use it to screen the other list -- any rectangle that doesn't overlap the container doesn't need to be tested against all the rectangles that contributed to it:

def containing_rectangle(rectangles):
    # Arguments are passed in the order the question's class stores them:
    # left, bottom, right, top.
    return Rectangle(
        min(r.left for r in rectangles),
        min(r.bottom for r in rectangles),
        max(r.right for r in rectangles),
        max(r.top for r in rectangles),
    )

c = containing_rectangle(listA)

for b in listB:
    if b.overlap(c):
        for a in listA:
            if b.overlap(a):
                pass  # a and b overlap; handle the pair here

In my testing with hundreds of random rectangles, this avoided comparisons on the order of single-digit percentages (e.g. 2% or 3%) and occasionally increased the number of comparisons. However, presumably your data isn't random and might fare better with this type of screening.

Depending on the nature of your data, you could break this up into a container rectangle check for each batch of 10K rectangles out of 50K, or whatever slice gives you maximum efficiency, possibly presorting the rectangles (e.g. by their centers) before assigning them to container batches.

We can break up and batch both lists with container rectangles:

listAA = [listA[x:x + 10] for x in range(0, len(listA), 10)]

for i, arrays in enumerate(listAA):
    listAA[i] = [containing_rectangle(arrays)] + arrays

listBB = [listB[x:x + 10] for x in range(0, len(listB), 10)]

for i, arrays in enumerate(listBB):
    listBB[i] = [containing_rectangle(arrays)] + arrays

for bb in listBB:
    for aa in listAA:
        # element 0 of each batch is its containing rectangle
        if bb[0].overlap(aa[0]):
            for b in bb[1:]:
                if b.overlap(aa[0]):
                    for a in aa[1:]:
                        if b.overlap(a):
                            pass  # a and b overlap; handle the pair here

With my random data, this decreased the comparisons on the order of 15% to 20%, even counting the container rectangle comparisons. The batching of rectangles above is arbitrary and you can likely do better.

The exception you're getting comes from the last line of the code you show. The expression list[rect] is not valid, since list is a class, and the [] syntax in that context is trying to index it. You probably want just [rect] (which creates a new list containing the single item rect).

There are several other basic issues with your code. For instance, your Rect.__init__ method doesn't set a left attribute, which you seem to expect in your collision testing method. You've also used different capitalization for r1 and r2 in different parts of the overlap method (Python doesn't consider r1 to be the same as R1).

Those issues don't really have anything to do with testing more than two rectangles, which your question asks about. The simplest way to do that (and I strongly advise sticking to simple algorithms if you're having basic issues like the ones mentioned above) is to compare each rectangle with each other rectangle using the existing pairwise test. You can use itertools.combinations to easily get all pairs of items from an iterable (like a list):

import itertools

list_of_rects = [rect1, rect2, rect3, rect4] # assume these are defined elsewhere

for a, b in itertools.combinations(list_of_rects, 2):
    if a.overlap(b):
        pass  # do whatever you want to do when two rectangles overlap here

This implementation using numpy is about 35-40 times faster according to a test I did. For 2 lists each with 10000 random rectangles this method took 2.5 secs and the method in the question took ~90 sec. In terms of complexity it's still O(N^2) like the method in the question.

import numpy as np

rects1=[
    [0,10,0,10],
    [0,100,0,100],
]

rects2=[
    [20,50,20,50],
    [200,500,200,500],
    [0,12,0,12]
]

data=np.asarray(rects2)


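def find_overlaps(rect,data):
    # rows are read as [xmin, xmax, ymin, ymax]; each step keeps only the
    # rows that can still overlap rect
    data=data[data[:,0]<rect[1]]  # xmin left of rect's xmax
    data=data[data[:,1]>rect[0]]  # xmax right of rect's xmin
    data=data[data[:,2]<rect[3]]  # ymin below rect's ymax
    data=data[data[:,3]>rect[2]]  # ymax above rect's ymin
    return data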


for rect in rects1:
    overlaps = find_overlaps(rect,data)
    for overlap in overlaps:
        pass#do something here

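If your list (at least listB) is sorted by r2.xmin, you can binary-search for r1.xmax in listB and stop testing r1 against listB at that position (the remaining rectangles all lie to the right of r1). Sorting takes O(n·log(n)) and each search O(log(n)); the rectangles before that position still have to be tested, so how much this saves depends on the data.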

A sorted vector has faster access than a sorted list.

I'm assuming that the rectangles' edges are parallel to the axes.

Also fix your overlap() function as cdlane explained.
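
The sorted-list idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Rectangle class here takes its arguments in the order left, bottom, right, top, the function name overlapping_pairs_sorted is made up for this example, and touching edges count as overlapping (as in the question's overlap()).

```python
from bisect import bisect_right
from operator import attrgetter


class Rectangle:
    """Axis-aligned rectangle; the argument order is an assumption of this sketch."""

    def __init__(self, left, bottom, right, top):
        self.left, self.bottom, self.right, self.top = left, bottom, right, top


def overlapping_pairs_sorted(list_a, list_b):
    """Return all (a, b) pairs that overlap; touching edges count as overlap."""
    # Sort listB once by its left edge (xmin).
    sorted_b = sorted(list_b, key=attrgetter("left"))
    lefts = [b.left for b in sorted_b]
    pairs = []
    for a in list_a:
        # Every rectangle from index `stop` on starts to the right of a.
        stop = bisect_right(lefts, a.right)
        for b in sorted_b[:stop]:
            # b.left <= a.right already holds; check the other three conditions.
            if b.right >= a.left and b.bottom <= a.top and b.top >= a.bottom:
                pairs.append((a, b))
    return pairs
```

The binary search only cuts off rectangles that start too far to the right; wide rectangles near the start of the sorted list are still tested every time.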

If you know the upper and lower limits for the coordinates, you can narrow the search by partitioning the coordinate space into squares, e.g. 100x100.

  • Make one "set" per coordinate square.
  • Go through all rectangles, putting each one in the "set" of every square it overlaps.

See also Tiled Rendering which uses partitions to speed up graphical operations.

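    // Stores rectangles which overlap (x, y)..(x+w-1, y+h-1)
    public class RectangleSet
    {
       private List<Rectangle> _overlaps = new List<Rectangle>();

       public readonly int X, Y, W, H;

       public RectangleSet(int x, int y, int w, int h)
       {
          X = x; Y = y; W = w; H = h;
       }

       public void Add(Rectangle r) { _overlaps.Add(r); }
    }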

    // Partitions the coordinate space into squares
    public class CoordinateArea
    {
       private const int SquareSize = 100;

       public List<RectangleSet> Squares = new List<RectangleSet>();

       public CoordinateArea(int xmin, int ymin, int xmax, int ymax)
       {
          for (int x = xmin; x <= xmax; x += SquareSize)
          for (int y = ymin; y <= ymax; y += SquareSize)
          {
              Squares.Add(new RectangleSet(x, y, SquareSize, SquareSize));
          }
       }

       // Adds a list of rectangles to the coordinate space
       public void AddRectangles(IEnumerable<Rectangle> list)
       {
          foreach (Rectangle r in list)
          {
              foreach (RectangleSet set in Squares)
              {
                  if (r.Overlaps(set))
                      set.Add(r);
              }
          }
       }
    }

Now you have a much smaller set of rectangles for comparison, which should speed things up nicely.

CoordinateArea A = new CoordinateArea(-500, -500, +1000, +1000);
CoordinateArea B = new CoordinateArea(-500, -500, +1000, +1000);  // same limits for A, B

A.AddRectangles(listA);
B.AddRectangles(listB);

for (int i = 0; i < A.Squares.Count; i++)
{
    RectangleSet setA = A.Squares[i];
    RectangleSet setB = B.Squares[i];

    // *** small number of rectangles, which you can now check thoroughly for overlaps ***

}
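
For readers following along in Python, the same partitioning idea might look like the sketch below. It is not a translation of the C# above: rectangles are assumed to be plain (left, bottom, right, top) tuples, the names CELL, cells, overlaps and grid_overlaps are invented for this example, and the cells a rectangle touches are computed directly from its coordinates rather than by testing it against every square, so no coordinate limits need to be known in advance.

```python
from collections import defaultdict

CELL = 100  # cell size; an assumption, tune it to the typical rectangle size


def cells(rect, cell=CELL):
    """Yield the grid cells (ix, iy) touched by rect = (left, bottom, right, top)."""
    left, bottom, right, top = rect
    for ix in range(int(left // cell), int(right // cell) + 1):
        for iy in range(int(bottom // cell), int(top // cell) + 1):
            yield ix, iy


def overlaps(r, s):
    """Closed-interval overlap test; touching edges count as overlap."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]


def grid_overlaps(list_a, list_b, cell=CELL):
    """Return the set of index pairs (i, j) where list_a[i] overlaps list_b[j]."""
    # Register every rectangle of list_b in each cell it touches.
    grid = defaultdict(list)
    for j, b in enumerate(list_b):
        for c in cells(b, cell):
            grid[c].append(j)
    # Test each rectangle of list_a only against rectangles sharing a cell.
    found = set()
    for i, a in enumerate(list_a):
        for c in cells(a, cell):
            for j in grid.get(c, ()):
                if (i, j) not in found and overlaps(a, list_b[j]):
                    found.add((i, j))
    return found
```

A rectangle much larger than the cell size lands in many cells, so the choice of cell size matters; the set of found pairs prevents reporting a pair once per shared cell.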

I think you have to set up an additional data structure (a spatial index) that gives fast access to nearby rectangles that potentially overlap, in order to reduce the time complexity from quadratic to linearithmic.


Here is what I use to calculate overlap areas of many candidate rectangles (with candidate_coords [[l, t, r, b], ...]) with a target one (target_coords [l, t, r, b]):

comb_tensor = np.zeros((2, candidate_coords.shape[0], 4))

comb_tensor[0, :] = target_coords
comb_tensor[1] = candidate_coords

dx = np.amin(comb_tensor[:, :, 2].T, axis=1) - np.amax(comb_tensor[:, :, 0].T, axis=1)
dy = np.amin(comb_tensor[:, :, 3].T, axis=1) - np.amax(comb_tensor[:, :, 1].T, axis=1)

dx[dx < 0] = 0
dy[dy < 0] = 0 

overlap_areas = dx * dy

This should be fairly efficient, especially if there are many candidate rectangles, as everything is done with numpy functions operating on ndarrays. To handle several target rectangles, you can either loop over them, calculating the overlap areas for each, or perhaps add one more dimension to comb_tensor.
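
One possible way to add that extra dimension is NumPy broadcasting, sketched below. The function name pairwise_overlap_areas is hypothetical, and the sketch assumes rows of the form [l, t, r, b] with l <= r and t <= b (y growing downwards, as in image coordinates), matching the subtraction order used above.

```python
import numpy as np


def pairwise_overlap_areas(targets, candidates):
    """Overlap area of every target rectangle with every candidate rectangle.

    Both inputs have shape (k, 4) with rows [l, t, r, b], l <= r and t <= b.
    Returns an array of shape (len(targets), len(candidates)).
    """
    t = np.asarray(targets, dtype=float)[:, None, :]     # shape (m, 1, 4)
    c = np.asarray(candidates, dtype=float)[None, :, :]  # shape (1, n, 4)
    # Width and height of the intersection; negative means no overlap.
    dx = np.minimum(t[..., 2], c[..., 2]) - np.maximum(t[..., 0], c[..., 0])
    dy = np.minimum(t[..., 3], c[..., 3]) - np.maximum(t[..., 1], c[..., 1])
    return np.clip(dx, 0, None) * np.clip(dy, 0, None)
```

The result holds m × n values at once, so for very large inputs it may be necessary to process the targets in chunks to keep memory use in check.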

I think the below code will be useful.

print("Identifying Overlap between n number of rectangle")
#List to be used in set and get_coordinate_checked_list
coordinate_checked_list = []

def get_coordinate_checked_list():
    #returns the overlapping coordinates of rectangles
    """
    :return: list of overlapping coordinates
    """
    return coordinate_checked_list

def set_coordinate_checked_list(coordinates):
    #appends the overlapping coordinates of rectangles
    """
    :param coordinates: list of overlapping coordinates to be appended in coordinate_checked_list
    :return:
    """
    coordinate_checked_list.append(coordinates)

def overlap_checked_for(coordinates):
    # to find rectangle overlap is already checked, if checked "True" will be returned else coordinates will be added
    # to coordinate_checked_list and return "False"
    """
    :param coordinates: coordinates of two rectangles
    :return: True if already checked, else False
    """
    if coordinates in get_coordinate_checked_list():
        return True
    else:
        set_coordinate_checked_list(coordinates)
        return False

def __isRectangleOverlap(R1, R2):
    #checks if two rectangles overlap
    """
    :param R1: Rectangle1 with coordinates [x0,y0,x1,y1]
    :param R2: Rectangle2 with coordinates [x0,y0,x1,y1]
    :return: True if the rectangles overlap, else False
    """
    if (R1[0] >= R2[2]) or (R1[2] <= R2[0]) or (R1[3] <= R2[1]) or (R1[1] >= R2[3]):
        return False
    else:
        print("Rectangle1 {} overlaps with Rectangle2 {}".format(R1,R2))
        return True


def __splitByHeightandWidth(rectangles):
    # Gets the list of rectangles, divides the page with respect to height and width, and places
    # each rectangle in the suitable section (left_up, left_down, right_up, right_down); returns the
    # rectangles grouped by section

    """
    :param rectangles: list of rectangle coordinates, each given as [x0,y0,x1,y1]
    :return: list of rectangles grouped by section, plus a suspect list which holds the rectangles
            positioned in more than one section
    """

    lu_Rect = []
    ll_Rect = []
    ru_Rect = []
    rl_Rect = []
    sus_list = []
    min_h = 0
    max_h = 0
    min_w = 0
    max_w = 0
    value_assigned = False
    for rectangle in rectangles:
        if not value_assigned:
            min_h = rectangle[1]
            max_h = rectangle[3]
            min_w = rectangle[0]
            max_w = rectangle[2]
        value_assigned = True
        if rectangle[1] < min_h:
            min_h = rectangle[1]
        if rectangle[3] > max_h:
            max_h = rectangle[3]
        if rectangle[0] < min_w:
            min_w = rectangle[0]
        if rectangle[2] > max_w:
            max_w = rectangle[2]

    # The dividing lines are the midpoints of the bounding box.
    mid_h = (min_h + max_h) / 2
    mid_w = (min_w + max_w) / 2

    for rectangle in rectangles:
        if rectangle[3] <= mid_h:
            if rectangle[2] <= mid_w:
                ll_Rect.append(rectangle)
            elif rectangle[0] >= mid_w:
                rl_Rect.append(rectangle)
            else:
                # straddles the vertical dividing line
                ll_Rect.append(rectangle)
                rl_Rect.append(rectangle)
                sus_list.append(rectangle)
        if rectangle[1] >= mid_h:
            if rectangle[2] <= mid_w:
                lu_Rect.append(rectangle)
            elif rectangle[0] >= mid_w:
                ru_Rect.append(rectangle)
            else:
                # straddles the vertical dividing line
                lu_Rect.append(rectangle)
                ru_Rect.append(rectangle)
                sus_list.append(rectangle)
        if rectangle[1] < mid_h and rectangle[3] > mid_h:
            # straddles the horizontal dividing line
            if rectangle[0] < mid_w and rectangle[2] > mid_w:
                lu_Rect.append(rectangle)
                ll_Rect.append(rectangle)
                ru_Rect.append(rectangle)
                rl_Rect.append(rectangle)
                sus_list.append(rectangle)
            elif rectangle[2] <= mid_w:
                lu_Rect.append(rectangle)
                ll_Rect.append(rectangle)
                sus_list.append(rectangle)
            else:
                # here rectangle[0] >= mid_w
                ru_Rect.append(rectangle)
                rl_Rect.append(rectangle)
                sus_list.append(rectangle)
    return [lu_Rect, ll_Rect, ru_Rect, rl_Rect], sus_list

def find_overlap(rectangles):
    #Find all possible overlap between the list of rectangles
    """
    :param rectangles: list of rectangle grouped with respect to section
    :return:
    """
    split_Rectangles , sus_list = __splitByHeightandWidth(rectangles)
    for section in split_Rectangles:
        for rect in range(len(section)-1):
            for i in range(len(section)-1):
                if section[0] in sus_list and section[i+1] in sus_list:
                    if not overlap_checked_for([section[0],section[i+1]]):
                        __isRectangleOverlap(section[0],section[i+1])
                else:
                    __isRectangleOverlap(section[0],section[i+1])
            section.pop(0)

arr =[[0,0,2,2],[0,0,2,7],[0,2,10,3],[3,0,4,1],[6,1,8,8],[0,7,2,8],[4,5,5,6],[4,6,10,7],[9,3,10,5],[5,3,6,4],[4,3,6,5],[4,3,5,6]]
find_overlap(arr)

For a simple solution that improves on pure brute force if the rectangles are relatively sparse:

  • sort all Y ordinates in a single list, and for every ordinate store the index of the rectangle, the originating list and a flag to distinguish bottom and top;

  • scan the list from bottom to top, maintaining two "active lists", one per rectangle set;

    • when you meet a bottom, insert the rectangle index in its active list and compare to all rectangles in the other list to detect overlaps on X;

    • when you meet a top, remove the rectangle index from its active list.

Assuming simple linear lists, the updates and searches will take time linear in the size of the active lists. So instead of M × N comparisons, you will perform M × n + m × N comparisons, where m and n denote the average active-list sizes. (If the rectangles do not overlap within their own set, one can expect an average list length not exceeding √M and √N.)
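
The scan described above could be sketched as follows. The function name sweep_overlaps and the (left, bottom, right, top) tuple layout are assumptions made for this example; Python sets stand in for the "active lists" so that removal is cheap, and touching edges are counted as overlapping.

```python
def sweep_overlaps(rects_a, rects_b):
    """Return index pairs (i, j) where rects_a[i] overlaps rects_b[j].

    Rectangles are (left, bottom, right, top) tuples with bottom <= top.
    """
    BOTTOM, TOP = 0, 1  # bottoms sort before tops at equal y, so touching counts
    lists = (rects_a, rects_b)
    events = []
    for which, rects in enumerate(lists):
        for idx, (left, bottom, right, top) in enumerate(rects):
            events.append((bottom, BOTTOM, which, idx))
            events.append((top, TOP, which, idx))
    events.sort()

    active = (set(), set())  # one active set per input list
    pairs = []
    for _, kind, which, idx in events:
        if kind == TOP:
            active[which].discard(idx)
            continue
        mine = lists[which][idx]
        for other_idx in active[1 - which]:
            other = lists[1 - which][other_idx]
            # The y-ranges already intersect; only the x-ranges need checking.
            if mine[0] <= other[2] and other[0] <= mine[2]:
                pairs.append((idx, other_idx) if which == 0 else (other_idx, idx))
        active[which].add(idx)
    return pairs
```

Each overlapping pair is reported exactly once, at the moment the later of the two bottoms is met.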


 