I need to calculate the distance between each pixel and each centroid.
Arguments:
Returns:
import math
import numpy

def distance(X, C):
    # dist[i][j] holds the distance from pixel i to centroid j
    dist = numpy.empty((X.shape[0], C.shape[0]))
    for i, x in enumerate(X):
        for j, c in enumerate(C):
            dist[i][j] = euclidean_dist(x, c)
    return dist

def euclidean_dist(x, y):
    # Assumes 3-component points (D = 3)
    x1, y1, z1 = x
    x2, y2, z2 = y
    return math.sqrt((x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2)
If you can add a SciPy dependency, this is already implemented in scipy.spatial.distance.cdist. Otherwise, you can use NumPy broadcasting together with numpy.linalg.norm:
SciPy Implementation
from scipy.spatial import distance
distance.cdist(X, C, 'euclidean')
NumPy Implementation
import numpy as np
np.linalg.norm(X[:,None,:] - C, axis=2)
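To make the broadcasting step concrete, here is a minimal sketch on tiny made-up inputs (the values of X and C below are illustrative assumptions, not data from the question). It shows how X[:, None, :] - C expands to shape (P, K, D) before the norm reduces over the last axis:

```python
import numpy as np

# Illustrative inputs only: 2 pixels and 3 centroids in 3 dimensions.
X = np.array([[0.0, 0.0, 0.0],
              [1.0, 2.0, 2.0]])          # shape (P, D) = (2, 3)
C = np.array([[0.0, 0.0, 0.0],
              [3.0, 4.0, 0.0],
              [1.0, 2.0, 2.0]])          # shape (K, D) = (3, 3)

# X[:, None, :] has shape (P, 1, D); subtracting C of shape (K, D)
# broadcasts to (P, K, D): one difference vector per (pixel, centroid) pair.
diff = X[:, None, :] - C

# Taking the norm over axis=2 collapses D and leaves a (P, K) matrix.
dist = np.linalg.norm(diff, axis=2)

print(diff.shape)  # (2, 3, 3)
print(dist.shape)  # (2, 3)
print(dist[0])     # distances from pixel 0 to each centroid: 0, 5 and 3
```

Row i of the result corresponds to pixel i and column j to centroid j, matching the layout of the loop-based distance function.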
Performance
P = 100_000
K = 1_000
D = 3
X = np.random.randint(0,10, (P,D))
C = np.random.randint(0,10, (K,D))
%timeit distance.cdist(X, C, 'euclidean')
1.06 s ± 57 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.linalg.norm(X[:,None,:] - C, axis=2)
15 s ± 2.18 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
We can see that for large X and C, the SciPy implementation is much faster: about 1.06 s versus 15 s per call in this benchmark, roughly 14 times faster.
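One plausible contributor to the gap is the (P, K, D) intermediate array that the broadcast subtraction materializes. If you must stay NumPy-only, a sketch like the following bounds that intermediate by processing X in row chunks. The helper name pairwise_dist_chunked and the chunk parameter are hypothetical, and this variant has not been timed against the benchmark above:

```python
import numpy as np

def pairwise_dist_chunked(X, C, chunk=10_000):
    """Hypothetical helper: Euclidean distances, computed `chunk` rows of X at a time."""
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    out = np.empty((X.shape[0], C.shape[0]))
    for start in range(0, X.shape[0], chunk):
        stop = start + chunk
        # The temporary here is at most (chunk, K, D) rather than (P, K, D).
        out[start:stop] = np.linalg.norm(X[start:stop, None, :] - C, axis=2)
    return out

# Size of the full (P, K, D) temporary for the benchmark shapes,
# assuming 8-byte elements.
P, K, D = 100_000, 1_000, 3
full_bytes = P * K * D * 8
print(full_bytes / 1e9)  # 2.4 (gigabytes)
```

The result has the same (P, K) layout as the one-line broadcast version. Only peak memory for the temporary changes.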