
How to efficiently compute Euclidean distance matrices without for loops in Python?

I have a (51266,20,25,3) (N,F,J,C) array, where N is the example number, F is the frame number, J is the joint, and C is the xyz coordinates of the joint. I want to calculate the Euclidean distance matrix for each frame in each example, to get an output of shape (51266,20,25,25). My code is:

from sklearn.metrics.pairwise import euclidean_distances as euc
from tqdm import tqdm
import numpy as np
Examples = np.load('allExamples.npy')
theEuclideanMethod = np.zeros((0,20,25,25))
for example in tqdm(range(Examples.shape[0])):
  euclideanBox = np.zeros((0,25,25))
  for frame in range(20):
    euclideanBox = np.concatenate((euclideanBox,euc(Examples[example,frame,:,:])[np.newaxis,...]),axis=0)

  euclideanBox = euclideanBox[np.newaxis,...]
  theEuclideanMethod = np.concatenate((theEuclideanMethod,euclideanBox))

np.save("Euclidean examples.npy",theEuclideanMethod)
print(theEuclideanMethod.shape,"Euclidean shape")  

The problem is that I'm using for loops, which are super slow. What other ways are there to modify my code so it runs faster?

You can use array broadcasting, like this:

import numpy as np

examples = np.random.uniform(size=(5, 6, 7, 3))
N, F, J, C = examples.shape

# deltas.shape == (N, F, J, J, C) - Cartesian deltas
deltas = examples.reshape(N, F, J, 1, C) - examples.reshape(N, F, 1, J, C)

# distances.shape == (N, F, J, J)
distances = np.sqrt((deltas**2).sum(axis=-1), dtype=np.float32)

del deltas # release memory (only needed for interactive use)

This is a bit memory-hungry: with the values of N, F, J, C that you mentioned, the intermediate result (deltas) will take about 16 GB, assuming double precision. It will be more efficient (6x less memory and better use of cache) if you preallocate the output array in single precision and loop over the N axis:

distances = np.empty((N, F, J, J), dtype=np.float32)

for i, ex in enumerate(examples):
    # deltas.shape = (F, J, J, C) - Cartesian deltas
    deltas = ex.reshape(F, J, 1, C) - ex.reshape(F, 1, J, C)
    distances[i] = np.sqrt((deltas**2).sum(axis=-1))
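
As a quick sanity check, the preallocated-loop result can be compared against the loop-based sklearn call from the question. This is only a minimal sketch (the small sample shape is made up so it finishes instantly), and the comments spell out the memory estimate mentioned above:

import numpy as np
from sklearn.metrics.pairwise import euclidean_distances as euc

# Memory estimate for the full-broadcast deltas array in double precision:
# 51266 * 20 * 25 * 25 * 3 * 8 bytes ~= 15.4 GB, versus roughly 2.6 GB for a
# preallocated float32 (51266, 20, 25, 25) output -- about the 6x figure above.

# Small random sample so the check finishes instantly
examples = np.random.uniform(size=(4, 5, 25, 3))
N, F, J, C = examples.shape

distances = np.empty((N, F, J, J), dtype=np.float32)
for i, ex in enumerate(examples):
    deltas = ex.reshape(F, J, 1, C) - ex.reshape(F, 1, J, C)
    distances[i] = np.sqrt((deltas**2).sum(axis=-1))

# One frame should match sklearn's pairwise Euclidean distances
assert np.allclose(distances[0, 0], euc(examples[0, 0]), atol=1e-5)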

The following batched approach should also run pretty fast. Float32 is used to keep memory usage low, but it is optional. Increase batch_size for more speed, or decrease it to use less memory.

import numpy as np

# Adjust batch_size depending on your memory
batch_size = 500

# Make some fake data
x = np.random.randn(51266,20,25,3).astype(np.float32)
y = np.random.randn(51266,20,25,3).astype(np.float32)

# Preallocate the output distance matrix, shape (N, F, J, J)
d = np.empty(x.shape[:-1] + (x.shape[-2],), dtype=np.float32)
# Number of batches
n_batches = (x.shape[0] - 1) // batch_size + 1
for i in range(n_batches):
    d[i*batch_size:(i+1)*batch_size] = np.sqrt(np.sum((
        x[i*batch_size:(i+1)*batch_size, :, :, None] -
        y[i*batch_size:(i+1)*batch_size, :, None, :])**2, axis=-1))
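
For the data in the question there is only one array, so x and y are simply the same array. As a rough illustration (the helper name batched_self_distances is made up here, not part of the original answer), the batching logic could be wrapped like this:

import numpy as np

def batched_self_distances(points, batch_size=500):
    """Pairwise distances within each (J, C) point set, computed in batches.

    points: array of shape (N, F, J, C)
    returns: float32 array of shape (N, F, J, J)
    """
    n, f, j, c = points.shape
    out = np.empty((n, f, j, j), dtype=np.float32)
    for start in range(0, n, batch_size):
        chunk = points[start:start + batch_size]
        # (B, F, J, 1, C) - (B, F, 1, J, C) -> (B, F, J, J, C)
        deltas = chunk[:, :, :, None, :] - chunk[:, :, None, :, :]
        out[start:start + batch_size] = np.sqrt((deltas**2).sum(axis=-1))
    return out

# Usage with the array from the question (assuming it has been saved as in the question):
# Examples = np.load('allExamples.npy')
# distances = batched_self_distances(Examples.astype(np.float32))
# np.save('Euclidean examples.npy', distances)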
