How to match closest arrays in two lists in Python?

Question

I'm detecting some objects on 2400x3000 images. The size is the same for every image, but each image is slightly different (maybe a bit rotated, skewed, etc.). Therefore, I want to use the coordinates of the bounding boxes detected in the first image as a reference for the other ones.

So I want to find the best matches between two arrays of bounding box coordinates. Here's an example:

[187.0, 489.0, 1501.0, 575.0]
[1810.0, 1967.0, 1917.0, 2052.0]
[1360.0, 2187.0, 1467.0, 2275.0]
[1256.0, 2188.0, 1361.0, 2276.0]
[506.0, 2197.0, 615.0, 2284.0]
[199.0, 2288.0, 306.0, 2372.0]

This is the first report of coordinates. The other one is:

[200.0, 490.0, 1491.0, 588.0]
[1813.0, 1966.0, 1919.0, 2053.0]
[1370.0, 2188.0, 1473.0, 2276.0]
[1265.0, 2189.0, 1365.0, 2275.0]
[520.0, 2200.0, 629.0, 2288.0]
[222.0, 2291.0, 327.0, 2376.0]

As you can see, these are already matched. Every row in the first report corresponds to the same row in the second one.

But I want to do it automatically. I want Python to find the closest array and match those.

I've looked at algorithms for array comparison, list comparison etc. but I feel like my case is different since they are not random numbers but coordinates.

Anybody have an idea how to make this happen?

For example:

[200.0, 490.0, 1491.0, 588.0]

should match with:

[187.0, 489.0, 1501.0, 575.0]

OR This array:

[1813.0, 1966.0, 1919.0, 2053.0]

should match with:

[1810.0, 1967.0, 1917.0, 2052.0]

Thanks in advance.

EDIT: To be more specific, I need to sort two lists based on matches.

TL;DR SOLUTION: Please look at James' answer, it works like a charm!

Answer 1

You can use k-nearest neighbors with the number of neighbors set to 1. Each example in your first set of bounding boxes will correspond to its own class. After fitting the model, you can predict which bounding boxes are closest to the first set.

from sklearn.neighbors import KNeighborsClassifier
import numpy as np

d1 = np.array(
  [[187.0, 489.0, 1501.0, 575.0],
  [1810.0, 1967.0, 1917.0, 2052.0],
  [1360.0, 2187.0, 1467.0, 2275.0],
  [1256.0, 2188.0, 1361.0, 2276.0],
  [506.0, 2197.0, 615.0, 2284.0],
  [199.0, 2288.0, 306.0, 2372.0]]
)

d2 = np.array(
  [[200.0, 490.0, 1491.0, 588.0],
  [1813.0, 1966.0, 1919.0, 2053.0],
  [1370.0, 2188.0, 1473.0, 2276.0],
  [1265.0, 2189.0, 1365.0, 2275.0],
  [520.0, 2200.0, 629.0, 2288.0],
  [222.0, 2291.0, 327.0, 2376.0]]
)

classes = np.arange(len(d1))

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(d1, y=classes)
knn.predict(d2)
# returns:
array([0, 1, 2, 3, 4, 5])

As you noted, this already matches the order of the first set of bounding boxes. However, if we look at a more randomized set of data:

d3 = np.array([[ 524.0, 2182.0,  632.0, 2294.0],
  [1368.0, 2173.0, 1471.0, 2287.0],
  [ 182.0,  474.0, 1473.0,  605.0],
  [1797.0, 1975.0, 1930.0, 2055.0],
  [1281.0, 2202.0, 1356.0, 2263.0],
  [ 227.0, 2295.0,  339.0, 2394.0]]
)

matches = knn.predict(d3)
matches
# returns:
array([4, 2, 0, 1, 3, 5])

To actually use the closest matches, we can use argsort to reorder the d3 array so that it aligns with the classes of d1. Applying argsort to the predicted classes, matches, gives the index array that would sort those classes. Indexing d3 with that array then puts its rows in the same order as d1. This works as long as every class of d1 is predicted exactly once.

d3_sorted = d3[np.argsort(matches)]

d3_sorted
# returns:
array([[ 182.,  474., 1473.,  605.],
       [1797., 1975., 1930., 2055.],
       [1368., 2173., 1471., 2287.],
       [1281., 2202., 1356., 2263.],
       [ 524., 2182.,  632., 2294.],
       [ 227., 2295.,  339., 2394.]])
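One caveat: a 1-nearest-neighbor classifier can assign the same d1 row to two different boxes, in which case the argsort step no longer gives a clean one-to-one alignment. As an alternative, and not part of the approach above, here is a minimal sketch that treats the problem as an assignment problem, using SciPy's cdist and linear_sum_assignment to minimize the total Euclidean distance between paired rows. It assumes SciPy is installed and that both arrays contain the same number of boxes.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

d1 = np.array(
    [[187.0, 489.0, 1501.0, 575.0],
     [1810.0, 1967.0, 1917.0, 2052.0],
     [1360.0, 2187.0, 1467.0, 2275.0],
     [1256.0, 2188.0, 1361.0, 2276.0],
     [506.0, 2197.0, 615.0, 2284.0],
     [199.0, 2288.0, 306.0, 2372.0]]
)

d3 = np.array(
    [[524.0, 2182.0, 632.0, 2294.0],
     [1368.0, 2173.0, 1471.0, 2287.0],
     [182.0, 474.0, 1473.0, 605.0],
     [1797.0, 1975.0, 1930.0, 2055.0],
     [1281.0, 2202.0, 1356.0, 2263.0],
     [227.0, 2295.0, 339.0, 2394.0]]
)

# cost[i, j] is the Euclidean distance between d1[i] and d3[j]
cost = cdist(d1, d3)

# Choose exactly one d3 row for each d1 row so the summed distance is minimal
row_ind, col_ind = linear_sum_assignment(cost)

# Reorder d3 so that row i is the box paired with d1[i]
d3_aligned = d3[col_ind]
print(col_ind)
```

For this data the pairing is the same one the k-NN approach finds. The difference only shows up when two boxes compete for the same reference row, where the assignment solver is forced to pick distinct partners.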

Answer 2

Change the closest_percentage argument of the function below to define how close two arrays must be.

The default is 5, which means every pair of corresponding elements must differ by no more than 5 percent.

def closest_arrays(lst1, lst2, closest_percentage=5.0):
    # Arrays of different lengths cannot match.
    if len(lst1) != len(lst2):
        return False
    for a, b in zip(lst1, lst2):
        # Difference of b from a as a percentage of a (assumes a is non-zero).
        if abs(a - b) / abs(a) * 100 > closest_percentage:
            return False
    return True

The function returns True only if every element differs from its counterpart by at most closest_percentage (5 percent by default).
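A percentage check like this only says whether two arrays are close; it does not pair up the rows of two lists by itself. The sketch below shows one way it might be used for that. The names within_percent and pair_rows are hypothetical helpers written for this illustration, and the greedy "first unused candidate within tolerance" rule is an assumption, not something taken from the answer.

```python
def within_percent(a, b, percent=5.0):
    """Return True if every element of b is within `percent` of the matching element of a."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= abs(x) * percent / 100 for x, y in zip(a, b))


def pair_rows(reference, candidates, percent=5.0):
    """For each reference row, return the index of the first unused
    candidate row within tolerance, or None if there is none."""
    used = set()
    pairs = []
    for ref in reference:
        match = None
        for j, cand in enumerate(candidates):
            if j not in used and within_percent(ref, cand, percent):
                match = j
                used.add(j)
                break
        pairs.append(match)
    return pairs


reference = [[187.0, 489.0, 1501.0, 575.0], [1810.0, 1967.0, 1917.0, 2052.0]]
candidates = [[1813.0, 1966.0, 1919.0, 2053.0], [200.0, 490.0, 1491.0, 588.0]]

print(pair_rows(reference, candidates, percent=10.0))  # [1, 0]
print(pair_rows(reference, candidates, percent=5.0))   # [None, 0]
```

Note that a relative tolerance is strict for small coordinates: 187 versus 200 is a difference of about 7 percent, so the first box from the question is not matched at the 5 percent default even though it is only 13 pixels off.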

2 answers

solution1
1 ACCEPTED 2020-11-22 13:27:36

solution2
0 2020-11-22 13:44:57