How can I optimize the distance calculation between 3D points (x, y, z) in two arrays?
I need to calculate the distance between each pixel and each centroid.
Arguments:
Returns:
import math
import numpy

def distance(X, C):
    # dist[i][j] holds the distance from pixel i to centroid j
    dist = numpy.empty((X.shape[0], C.shape[0]))
    for i, x in enumerate(X):
        for j, c in enumerate(C):
            dist[i][j] = euclidean_dist(x, c)
    return dist

def euclidean_dist(x, y):
    # Euclidean distance between two 3D points
    x1, y1, z1 = x
    x2, y2, z2 = y
    return math.sqrt((x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2)
If you can add a SciPy dependency, this is already implemented in scipy.spatial.distance.cdist. Otherwise, we can use NumPy broadcasting and numpy.linalg.norm:
SciPy Implementation
from scipy.spatial import distance
distance.cdist(X, C, 'euclidean')
NumPy Implementation
import numpy as np
np.linalg.norm(X[:,None,:] - C, axis=2)
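To make the broadcasting step concrete, here is a minimal sketch on tiny made-up inputs (the four points and two centroids are hypothetical, chosen only so the distances are easy to check by hand). Indexing with None turns X from shape (P, D) into (P, 1, D), so subtracting C of shape (K, D) yields every pairwise difference in one (P, K, D) array, and the norm over the last axis collapses it to (P, K).

```python
import numpy as np

# Hypothetical toy data: 4 "pixels" and 2 "centroids" in 3D.
X = np.array([[0, 0, 0], [1, 2, 2], [3, 4, 0], [1, 1, 1]], dtype=float)
C = np.array([[0, 0, 0], [1, 1, 1]], dtype=float)

# (4, 1, 3) - (2, 3) broadcasts to (4, 2, 3): one difference vector per pair.
diff = X[:, None, :] - C

# Norm over the coordinate axis gives the (4, 2) distance matrix.
dist = np.linalg.norm(diff, axis=2)

# Reference: the same distances computed pair by pair.
ref = np.array([[np.sqrt(((x - c) ** 2).sum()) for c in C] for x in X])

print(diff.shape, dist.shape)
print(np.allclose(dist, ref))
```

Row i, column j of dist is the distance from X[i] to C[j], the same layout the loop-based version produces.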
Performance
P = 100_000
K = 1_000
D = 3
X = np.random.randint(0,10, (P,D))
C = np.random.randint(0,10, (K,D))
%timeit distance.cdist(X, C, 'euclidean')
1.06 s ± 57 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.linalg.norm(X[:,None,:] - C, axis=2)
15 s ± 2.18 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
We can see that for large X and C, the SciPy implementation is much faster: about 1.06 s versus 15 s per call in this benchmark, roughly a 14x difference.
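One plausible contributor to this gap, offered as an assumption rather than something measured here, is memory: the expression X[:,None,:] - C builds a full (P, K, D) intermediate array before the norm is taken. A rough sketch of the sizes involved, assuming 8-byte elements (the integer dtype returned by np.random.randint is platform-dependent) and NumPy 1.20 or later for np.broadcast_shapes:

```python
import math
import numpy as np

P, K, D = 100_000, 1_000, 3
ITEMSIZE = 8  # assumption: 8-byte elements (e.g. int64 or float64)

# Shape of the temporary created by X[:, None, :] - C.
diff_shape = np.broadcast_shapes((P, 1, D), (K, D))

# Bytes needed for that temporary versus the final (P, K) result.
diff_bytes = math.prod(diff_shape) * ITEMSIZE
result_bytes = P * K * ITEMSIZE

print(diff_shape)
print(diff_bytes / 1e9, "GB for the intermediate")
print(result_bytes / 1e9, "GB for the distance matrix")
```

Under these assumptions the temporary is three times the size of the final result, so the NumPy one-liner has to allocate and traverse far more memory than the output alone would require.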