
Calculate Euclidean distance between two python arrays

I want to write a function that calculates the Euclidean distance from each coordinate in list_a to each coordinate in list_b, and returns an array of distances with a rows and b columns (where a is the number of coordinates in list_a and b is the number of coordinates in list_b).

NB: I do not want to use any libraries other than numpy, for simplicity.

list_a = np.array([[0,1], [2,2], [5,4], [3,6], [4,2]])
list_b = np.array([[0,1],[5,4]])

Running the function would generate:

>>> np.array([[0., 5.830951894845301],
              [2.23606797749979, 3.605551275463989],
              [5.830951894845301, 0.],
              [5.830951894845301, 2.8284271247461903],
              [4.123105625617661, 2.23606797749979]])

I have been trying to run the code below:

def run_euc(list_a,list_b):
    euc_1 = [np.subtract(list_a, list_b)]
    euc_2 = sum(sum([i**2 for i in euc_1]))
    return np.sqrt(euc_2)

But I am getting the following error:

ValueError: operands could not be broadcast together with shapes (5,2) (2,2)

Thank you.

I wonder what is stopping you from using SciPy. Since you are already using numpy, you could also use SciPy, which is not a particularly heavy dependency.

Why?
It has many mathematical functions with efficient implementations to make good use of your computing power.

With that in mind, scipy.spatial provides a distance_matrix function for exactly the purpose you've mentioned.

Concretely, it takes your list_a (an m×k matrix) and list_b (an n×k matrix) and returns an m×n matrix of p-norm distances (p=2, the default, is Euclidean) between each pair of points across the two matrices.

from scipy.spatial import distance_matrix
distances = distance_matrix(list_a, list_b)
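
If you would rather stay with numpy only, the p-norm distance described above can be sketched with broadcasting. This is a minimal illustration, not SciPy's implementation; minkowski_matrix is a hypothetical helper name introduced here.

```python
import numpy as np

def minkowski_matrix(a, b, p=2):
    """Hypothetical numpy-only helper: p-norm distance between every row of a and every row of b."""
    a = np.asarray(a, dtype=float)  # shape (m, k)
    b = np.asarray(b, dtype=float)  # shape (n, k)
    # (m, 1, k) - (1, n, k) broadcasts to (m, n, k): one difference vector per pair of points
    diff = np.abs(a[:, None, :] - b[None, :, :])
    # Sum |diff|**p over the coordinate axis, then take the p-th root -> shape (m, n)
    return np.sum(diff ** p, axis=-1) ** (1.0 / p)

list_a = np.array([[0, 1], [2, 2], [5, 4], [3, 6], [4, 2]])
list_b = np.array([[0, 1], [5, 4]])

euclid = minkowski_matrix(list_a, list_b, p=2)     # Euclidean distances, shape (5, 2)
manhattan = minkowski_matrix(list_a, list_b, p=1)  # Manhattan distances, shape (5, 2)
```

With p=2 this should agree with the expected output in the question; other values of p give other Minkowski distances.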

Here, you can just use np.linalg.norm to compute the Euclidean distance. Your bug arises because np.subtract requires its two inputs to have shapes that can be broadcast together, and shapes (5,2) and (2,2) cannot.

import numpy as np

list_a = np.array([[0,1], [2,2], [5,4], [3,6], [4,2]])
list_b = np.array([[0,1],[5,4]])

def run_euc(list_a,list_b):
    return np.array([[ np.linalg.norm(i-j) for j in list_b] for i in list_a])

print(run_euc(list_a, list_b))

The code produces:

[[0.         5.83095189]
 [2.23606798 3.60555128]
 [5.83095189 0.        ]
 [5.83095189 2.82842712]
 [4.12310563 2.23606798]]
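
To make the broadcasting point concrete, here is a small sketch that reproduces the shape mismatch and then avoids it without a Python loop by adding a length-1 axis. The variable names broadcast_failed, diff and dists are only illustrative.

```python
import numpy as np

list_a = np.array([[0, 1], [2, 2], [5, 4], [3, 6], [4, 2]])  # shape (5, 2)
list_b = np.array([[0, 1], [5, 4]])                          # shape (2, 2)

# Shapes (5, 2) and (2, 2) disagree on the first axis (5 vs 2, and neither is 1),
# so direct subtraction raises the ValueError from the question.
try:
    list_a - list_b
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Adding a length-1 axis gives (5, 1, 2) - (2, 2), which broadcasts to (5, 2, 2):
# one difference vector for every (row of list_a, row of list_b) pair.
diff = list_a[:, np.newaxis, :] - list_b

# Take the Euclidean norm over the last (coordinate) axis -> shape (5, 2)
dists = np.linalg.norm(diff, axis=-1)
```

The trade-off is memory: the intermediate array has a × b × k elements, which is fine for small inputs like these but can matter for very large ones.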

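I think this works for the distance between two individual points (or two arrays of the same shape); it returns a single number: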

import numpy as np

def distance(x, y):
    x = np.array(x)
    y = np.array(y)
    p = np.sum((x - y)**2)
    d = np.sqrt(p)
    return d

I hope this answers the question, but this is a duplicate of: Minimum Euclidean distance between points in two different Numpy arrays, not within

# Import package
import numpy as np

# Define unequal matrices
xy1 = np.array([[0,1], [2,2], [5,4], [3,6], [4,2]])
xy2 = np.array([[0,1],[5,4]])

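# P[i, j] = squared norm of xy1[i] plus squared norm of xy2[j]
P = np.add.outer(np.sum(xy1**2, axis=1), np.sum(xy2**2, axis=1))
# N[i, j] = dot product of xy1[i] and xy2[j]
N = np.dot(xy1, xy2.T)
# Clip at 0 so floating-point round-off cannot put a negative value under the square root
dists = np.sqrt(np.maximum(P - 2*N, 0))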
print(dists)
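
The computation above relies on expanding the squared distance, so that all pairwise distances come from two vectors of squared norms and one matrix product. Writing x_i for row i of xy1 and y_j for row j of xy2 (notation introduced here for illustration only):

```latex
\begin{aligned}
\lVert x_i - y_j \rVert^2 &= (x_i - y_j)\cdot(x_i - y_j) \\
&= \lVert x_i \rVert^2 + \lVert y_j \rVert^2 - 2\, x_i \cdot y_j \\
&= P_{ij} - 2 N_{ij}
\end{aligned}
```

Taking the square root of each entry gives the distance matrix. With the integer inputs used here the arithmetic is exact; with floating-point inputs, P - 2N can come out very slightly negative for nearly identical points because of round-off.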

Another way you can do this is:

np.array(
[np.sqrt((list_a[:,1]-list_b[i,1])**2+(list_a[:,0]-list_b[i,0])**2) for i in range(len(list_b))]
).T

Output:

array([[0.        , 5.83095189],
       [2.23606798, 3.60555128],
       [5.83095189, 0.        ],
       [5.83095189, 2.82842712],
       [4.12310563, 2.23606798]])

This code could probably be written in a simpler and more efficient way, so if you find anything that could be improved, please let me know in the comments.


 