Numpy: find the Euclidean distance between two 3-D arrays

Given two 3-D arrays of shape (2,2,2):

A = [[[ 0,  0],
    [92, 92]],

   [[ 0, 92],
    [ 0, 92]]]

B = [[[ 0,  0],
    [92,  0]],

   [[ 0, 92],
    [92, 92]]]

How do you efficiently find the Euclidean distance between each pair of corresponding vectors in A and B?

I have tried for-loops, but they are slow, and I'm working with 3-D arrays with shapes on the order of (>>2, >>2, 2).

Ultimately I want a matrix of the form:

C = [[d1, d2],
     [d3, d4]]

Edit:

I've tried the following loop, but the biggest issue with it is that it loses the dimensions I want to keep. The distances are correct, though.

[numpy.sqrt((A[row, col][0] - B[row, col][0])**2 + (B[row, col][1] -A[row, col][1])**2) for row in range(2) for col in range(2)]

Thinking in a NumPy vectorized way, that means performing element-wise subtraction, squaring and summing along the last axis, and finally taking the square root. So, the straightforward implementation would be -

np.sqrt(((A - B)**2).sum(-1))
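As a quick check, the expression can be applied to the A and B from the question. This is a minimal sketch that assumes only that NumPy is imported as np; the result keeps the leading (2, 2) shape because only the last axis is reduced.

```python
import numpy as np

A = np.array([[[0, 0], [92, 92]],
              [[0, 92], [0, 92]]])
B = np.array([[[0, 0], [92, 0]],
              [[0, 92], [92, 92]]])

# Subtract element-wise, square, sum over the last axis, take the root.
C = np.sqrt(((A - B) ** 2).sum(-1))
print(C.shape)  # (2, 2)
print(C)        # [[ 0. 92.]
                #  [ 0. 92.]]
```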

We could perform the squaring and summing along the last axis in one go with np.einsum and thus make it more efficient, like so -

subs = A - B
out = np.sqrt(np.einsum('ijk,ijk->ij',subs,subs))
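To see what the subscript string does: 'ijk,ijk->ij' multiplies the two operands element-wise and sums over the index k, which is missing from the output. A small sketch comparing it with the direct version (the shapes and the random seed here are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 5, 2))
B = rng.random((4, 5, 2))

subs = A - B
# Direct version: square, then reduce the last axis.
direct = np.sqrt((subs ** 2).sum(-1))
# einsum version: multiply subs by itself and sum over k in one call.
via_einsum = np.sqrt(np.einsum('ijk,ijk->ij', subs, subs))

print(via_einsum.shape)                 # (4, 5)
print(np.allclose(direct, via_einsum))  # True
```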

Another alternative, with the numexpr module -

import numexpr as ne
np.sqrt(ne.evaluate('sum((A-B)**2,2)'))

Since we are working with a length of 2 along the last axis, we could just slice out those two components and feed them to the evaluate method. Please note that slicing isn't possible inside the evaluate string. So, the modified implementation would be -

a0 = A[...,0]
a1 = A[...,1]
b0 = B[...,0]
b1 = B[...,1]
out = ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')

Runtime test

Function definitions -

import numpy as np
import numexpr as ne

def sqrt_sum_sq_based(A,B):
    return np.sqrt(((A - B)**2).sum(-1))

def einsum_based(A,B):
    subs = A - B
    return np.sqrt(np.einsum('ijk,ijk->ij',subs,subs))

def numexpr_based(A,B):
    return np.sqrt(ne.evaluate('sum((A-B)**2,2)'))

def numexpr_based_with_slicing(A,B):
    a0 = A[...,0]
    a1 = A[...,1]
    b0 = B[...,0]
    b1 = B[...,1]
    return ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')

Timings -

In [288]: # Setup input arrays
     ...: dim = 2
     ...: N = 1000
     ...: A = np.random.rand(N,N,dim)
     ...: B = np.random.rand(N,N,dim)
     ...: 

In [289]: %timeit sqrt_sum_sq_based(A,B)
10 loops, best of 3: 40.9 ms per loop

In [290]: %timeit einsum_based(A,B)
10 loops, best of 3: 22.9 ms per loop

In [291]: %timeit numexpr_based(A,B)
10 loops, best of 3: 18.7 ms per loop

In [292]: %timeit numexpr_based_with_slicing(A,B)
100 loops, best of 3: 8.23 ms per loop

In [293]: %timeit np.linalg.norm(A-B, axis=-1) #@dnalow's soln
10 loops, best of 3: 45 ms per loop
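Timings like these depend on hardware and library versions, so it may be worth re-running them locally. Below is a rough sketch using the standard timeit module outside IPython, limited to the NumPy-only variants and with a smaller N to keep it quick. The name norm_based and the funcs dictionary are scaffolding invented for this sketch; it also checks that the variants agree before timing them.

```python
import timeit
import numpy as np

def sqrt_sum_sq_based(A, B):
    return np.sqrt(((A - B) ** 2).sum(-1))

def einsum_based(A, B):
    subs = A - B
    return np.sqrt(np.einsum('ijk,ijk->ij', subs, subs))

def norm_based(A, B):  # hypothetical wrapper name for the np.linalg.norm variant
    return np.linalg.norm(A - B, axis=-1)

N = 200  # smaller than the N = 1000 used above
A = np.random.rand(N, N, 2)
B = np.random.rand(N, N, 2)

funcs = {f.__name__: f for f in (sqrt_sum_sq_based, einsum_based, norm_based)}

# Correctness first: every variant should match the reference result.
ref = sqrt_sum_sq_based(A, B)
results_agree = all(np.allclose(f(A, B), ref) for f in funcs.values())
print("all variants agree:", results_agree)

# Best-of-3 timing, 10 calls per repeat, reported per call.
for name, f in funcs.items():
    best = min(timeit.repeat(lambda: f(A, B), number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.3f} ms per call")
```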

For completeness:

np.linalg.norm(A-B, axis=-1)

I recommend being extremely careful when using hand-rolled squares and square roots instead of the standard built-in math.hypot and np.hypot. These are fast, optimized, and very safe.

In that sense, np.linalg.norm(A-B, axis=-1) looks safest here.

For very large matrices, NumPy with broadcasting becomes memory-bound and slows down. In such cases you would want to use for loops without compromising on speed; numba is a good fit for that here.
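A minimal sketch of what such a loop could look like, assuming numba is installed. The function name dist_loop is made up for illustration, and the fallback decorator exists only so the sketch still runs (slowly) when numba is missing. The explicit loop avoids allocating the temporary A - B and squared arrays.

```python
import numpy as np

try:
    from numba import njit  # optional dependency (assumed available)
except ImportError:
    def njit(func):
        # Fallback: plain Python, same result, no speed-up.
        return func

@njit
def dist_loop(A, B):  # hypothetical name
    rows, cols, dim = A.shape
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(dim):
                d = A[i, j, k] - B[i, j, k]
                acc += d * d
            out[i, j] = np.sqrt(acc)
    return out

A = np.random.rand(50, 60, 2)
B = np.random.rand(50, 60, 2)
C = dist_loop(A, B)
print(C.shape)  # (50, 60)
```

Note that this loop still squares the differences directly, so it trades memory traffic for speed but does not add any overflow protection.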

import math

i, j, k = 1e+200, 1e+200, 1e+200
math.hypot(i, j, k)  # more than two arguments needs Python 3.8+
# 1.7320508075688773e+200

Refer: speed/overflow/underflow

np.sqrt(np.sum((np.array([i, j, k])) ** 2))
# RuntimeWarning: overflow encountered in square
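Since the last axis has length 2 in the question, np.hypot can be applied to the two sliced components directly. A small sketch with deliberately huge values (chosen here only to trigger the overflow) contrasts it with the hand-rolled version:

```python
import numpy as np

A = np.full((2, 2, 2), 1e200)
B = np.zeros((2, 2, 2))

diff = A - B
# np.hypot takes exactly two arguments, so pass the two components.
safe = np.hypot(diff[..., 0], diff[..., 1])

# Hand-rolled version: the squares overflow to inf before the root is taken.
with np.errstate(over='ignore'):
    naive = np.sqrt((diff ** 2).sum(-1))

print(np.isfinite(safe).all())  # True
print(np.isinf(naive).all())    # True
```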
