numpy.linalg.norm VS scipy cdist for L2 norm

Question

Apologies in advance for my basic question!

Given:

a = np.random.rand(6, 3)
b = np.random.rand(6, 3)

Using scipy.spatial.distance.cdist, d = cdist(a, b, 'euclidean') gives the full 6x6 matrix of pairwise distances:

[[0.8625803  0.29814357 0.97548993 0.84368212 0.66530478 0.95367553]
 [0.67858887 0.27603821 0.76236585 0.80857596 0.48560167 0.84517836]
 [0.53097997 0.41061975 0.66475479 0.54243987 0.47469843 0.70178229]
 [0.37678898 0.7855905  0.25492161 0.79870147 0.37795642 0.58136674]
 [0.73515058 0.90614048 0.88997676 0.15126486 0.82601188 0.63733843]
 [0.34345477 0.7927319  0.52963369 0.27127254 0.64808932 0.66528862]]

But d = np.linalg.norm(a - b, axis=1) returns only the diagonal of the scipy result:

[0.8625803  0.27603821 0.66475479 0.79870147 0.82601188 0.66528862]

Question:

Is it possible to get the same result as scipy.spatial.distance.cdist using only np.linalg.norm, or numpy in general?

Answer 1

You can use numpy broadcasting as follows:

d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
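
To make the shapes involved explicit, here is a minimal sketch of the same one-liner split into steps. The seed and the variable names diff, d_numpy and d_scipy are only chosen for illustration; they are not part of the original answer.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
a = rng.random((6, 3))
b = rng.random((6, 3))

# a[:, None, :] has shape (6, 1, 3); b[None, :, :] has shape (1, 6, 3).
# Broadcasting expands both to (6, 6, 3): one difference vector per (i, j) pair.
diff = a[:, None, :] - b[None, :, :]

# The norm over the last axis turns each difference vector into a distance.
d_numpy = np.linalg.norm(diff, axis=2)  # shape (6, 6)
d_scipy = cdist(a, b, 'euclidean')

print(diff.shape, d_numpy.shape)
print(np.allclose(d_numpy, d_scipy))

# The row-wise version from the question pairs a[i] only with b[i],
# which is why it yields just the diagonal of the full matrix.
d_rowwise = np.linalg.norm(a - b, axis=1)
print(np.allclose(d_rowwise, np.diag(d_numpy)))
```

Note that the intermediate array diff holds m * n * k values for inputs of shape (m, k) and (n, k), so memory use grows with the product of the two row counts.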

Performance should be similar to scipy.spatial.distance.cdist for arrays of this size. On my local machine:

%timeit np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
13.5 µs ± 1.71 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit cdist(a,b)
15 µs ± 236 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
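
The timings above are for 6x3 inputs only. To check how the two approaches compare on your own data outside IPython, a sketch along these lines could be used. The array sizes, the repeat count and the helper name broadcast_norm are assumptions made for illustration, and the results will depend on your machine and library versions.

```python
import timeit

import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)  # arbitrary seed
a = rng.random((500, 3))        # sizes chosen arbitrarily for this sketch
b = rng.random((500, 3))

def broadcast_norm(a, b):
    """Pairwise Euclidean distances via broadcasting (hypothetical helper name)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

repeats = 20
# timeit.timeit returns the total time for `number` calls; divide for the mean.
t_numpy = timeit.timeit(lambda: broadcast_norm(a, b), number=repeats) / repeats
t_scipy = timeit.timeit(lambda: cdist(a, b), number=repeats) / repeats

print(f"broadcast + norm: {t_numpy * 1e3:.3f} ms per call")
print(f"cdist:            {t_scipy * 1e3:.3f} ms per call")
```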
