Compute distances between all points in array efficiently using Python

I have a list of N=3 points like this as input: points = [[1, 1], [2, 2], [4, 4]]

I wrote this code to compute the distance between every pair of elements in my list points, where the distance is defined as dist = min(∣x1−x2∣,∣y1−y2∣):

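N = len(points)  # number of points (3 in this example)
distances = []
for i in range(N-1):
    for j in range(i+1, N):
        # smaller of the two per-coordinate absolute differences
        dist = min(abs(points[i][0]-points[j][0]), abs(points[i][1]-points[j][1]))
        distances.append(dist)
print(distances)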

The output is the list distances, holding every pairwise distance: [1, 3, 2]

It works fine with N=3, but I would like to compute it in a more efficient way and be free to set N=10^5. I am also trying to use numpy and scipy, but I am having a little trouble replacing the loops and picking the correct method.

Can anybody help me please? Thanks in advance

The numpythonic solution

To compute your distances using the full power of NumPy, and do it substantially faster:

  1. Convert your points to a NumPy array:

     import numpy as np
     pts = np.array(points)
  2. Then run:

     dist = np.abs(pts[np.newaxis, :, :] - pts[:, np.newaxis, :]).min(axis=2)

Here the result is a square N×N array in which dist[i, j] is the distance between points i and j. If you only want a list of the elements above the diagonal, in the same order as your code generates them, you can run:

dist2 = dist[np.triu_indices(pts.shape[0], 1)].tolist()
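Putting the steps above together, here is a minimal self-contained sketch. The function name pairwise_min_dist is only illustrative and is not part of NumPy. Run on the three points from the question, it should reproduce the output of the original loop:

```python
import numpy as np

def pairwise_min_dist(points):
    """Return min(|x1-x2|, |y1-y2|) for every pair i < j as a flat list."""
    pts = np.array(points)
    # Broadcasting: (1, N, 2) - (N, 1, 2) -> (N, N, 2) coordinate differences
    diff = np.abs(pts[np.newaxis, :, :] - pts[:, np.newaxis, :])
    # Minimum over the coordinate axis gives the symmetric (N, N) array
    dist = diff.min(axis=2)
    # Keep only the entries strictly above the diagonal (offset 1)
    return dist[np.triu_indices(pts.shape[0], 1)].tolist()

print(pairwise_min_dist([[1, 1], [2, 2], [4, 4]]))  # [1, 3, 2]
```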

I ran this code for the following 9 points:

points = [[1, 1], [2, 2], [4, 4], [3, 5], [2, 8], [4, 10], [3, 7], [2, 9], [4, 7]]

For the above data, the result saved in dist (a full array) is:

array([[0, 1, 3, 2, 1, 3, 2, 1, 3],
       [1, 0, 2, 1, 0, 2, 1, 0, 2],
       [3, 2, 0, 1, 2, 0, 1, 2, 0],
       [2, 1, 1, 0, 1, 1, 0, 1, 1],
       [1, 0, 2, 1, 0, 2, 1, 0, 1],
       [3, 2, 0, 1, 2, 0, 1, 1, 0],
       [2, 1, 1, 0, 1, 1, 0, 1, 0],
       [1, 0, 2, 1, 0, 1, 1, 0, 2],
       [3, 2, 0, 1, 1, 0, 0, 2, 0]])

and the list of elements above the diagonal is:

[1, 3, 2, 1, 3, 2, 1, 3, 2, 1, 0, 2, 1, 0, 2, 1, 2, 0, 1, 2, 0, 1, 1, 0, 1, 1,
  2, 1, 0, 1, 1, 1, 0, 1, 0, 2]

How much faster is my code

It turns out that even for a sample as small as the one I used (9 points), my code runs 2 times faster. For a sample of 18 points (not presented here), it is 6 times faster.

This speedup is achieved even though my function computes twice as much as needed, i.e. it generates a full array, whereas the part below the diagonal is just a mirror image of the part above the diagonal (which is what your code computes).
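If the duplicated half is a concern, one possible variation (a sketch of an alternative, not the method timed above) is to build the index pairs first and subtract only those, so the full square array is never created. The function name upper_min_dist is hypothetical:

```python
import numpy as np

def upper_min_dist(points):
    """Compute min(|x1-x2|, |y1-y2|) only for pairs with i < j."""
    pts = np.asarray(points)
    # Index arrays for all pairs above the diagonal, in row-major order
    i, j = np.triu_indices(len(pts), k=1)
    # (M, 2) array of coordinate differences, one row per pair
    return np.abs(pts[i] - pts[j]).min(axis=1)

print(upper_min_dist([[1, 1], [2, 2], [4, 4]]).tolist())  # [1, 3, 2]
```

Note that the result still has N(N-1)/2 entries, which is roughly 5×10^9 for N=10^5, so at that size memory remains the limiting factor whichever variant is used.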

For a larger number of points the difference should be much bigger. Run your own test on a bigger sample (say 100 points) and report how many times faster my code was.
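One way to run such a test is sketched below with the standard timeit module. The sample of 100 random integer points, the seed, and the function names loop_version and numpy_version are assumptions made for illustration; the measured ratio will depend on your machine:

```python
import timeit
import numpy as np

def loop_version(points):
    # The nested-loop approach from the question
    n = len(points)
    out = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            out.append(min(abs(points[i][0] - points[j][0]),
                           abs(points[i][1] - points[j][1])))
    return out

def numpy_version(points):
    # The broadcasting approach from the answer
    pts = np.array(points)
    dist = np.abs(pts[np.newaxis, :, :] - pts[:, np.newaxis, :]).min(axis=2)
    return dist[np.triu_indices(pts.shape[0], 1)].tolist()

# 100 random points with integer coordinates (arbitrary range and seed)
rng = np.random.default_rng(0)
sample = rng.integers(0, 1000, size=(100, 2)).tolist()

# Both versions must agree before the timings mean anything
assert loop_version(sample) == numpy_version(sample)

t_loop = timeit.timeit(lambda: loop_version(sample), number=20)
t_np = timeit.timeit(lambda: numpy_version(sample), number=20)
print(f"loop: {t_loop:.4f}s  numpy: {t_np:.4f}s  ratio: {t_loop / t_np:.1f}x")
```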
