Computing a matrix of pairwise angles between two arrays of points

Question

I have two arrays of points, x and y, shaped (n, p) and (m, p) respectively. As an example:

x = np.array([[ 0.     , -0.16341,  0.98656],
              [-0.05937, -0.25205,  0.96589],
              [ 0.05937, -0.25205,  0.96589],
              [-0.11608, -0.33488,  0.93508],
              [ 0.     , -0.33416,  0.94252]])
y = np.array([[ 0.     , -0.36836,  0.92968],
              [-0.12103, -0.54558,  0.82928],
              [ 0.12103, -0.54558,  0.82928]])

I want to compute an (n, m)-sized matrix that contains the angle between each pair of points (one from x, one from y), as in this question. That is, a vectorized version of:

import numpy as np
import numpy.linalg as la

n, m = len(x), len(y)  # number of points in x and y
theta = np.array(
            [np.arccos(np.dot(i, j) / (la.norm(i) * la.norm(j)))
                 for i in x for j in y]
        ).reshape((n, m))

Note: n and m can each be on the order of 10,000.

Answer 1

There are multiple ways to do this:

import numpy as np
import numpy.linalg as la
from scipy.spatial import distance as dist

# Manually
def method0(x, y):
    dotprod_mat = np.dot(x,  y.T)
    costheta = dotprod_mat / la.norm(x, axis=1)[:, np.newaxis]
    costheta /= la.norm(y, axis=1)
    return np.arccos(costheta)

# Using einsum
def method1(x, y):
    dotprod_mat = np.einsum('ij,kj->ik', x, y)
    costheta = dotprod_mat / la.norm(x, axis=1)[:, np.newaxis]
    costheta /= la.norm(y, axis=1)
    return np.arccos(costheta)

# Using scipy.spatial.cdist (one-liner)
def method2(x, y):
    costheta = 1 - dist.cdist(x, y, 'cosine')
    return np.arccos(costheta)

# Realize that your arrays `x` and `y` are already normalized, meaning you can
# optimize method1 even more
def method3(x, y):
    costheta = np.einsum('ij,kj->ik', x, y) # Directly gives costheta, since
                                            # ||x|| = ||y|| = 1
    return np.arccos(costheta)
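As a quick sanity check, a vectorized version can be compared against the double loop from the question on a small input. The sketch below is illustrative only: the names `pairwise_angles`, `pairwise_angles_loop`, `x_demo`, and `y_demo` are made up for this example, and the random test data is an assumption, not the data used for the timings.

```python
import numpy as np

def pairwise_angles(x, y):
    """Angles in radians between rows of x (n, p) and rows of y (m, p)."""
    costheta = np.einsum('ij,kj->ik', x, y)               # (n, m) dot products
    costheta /= np.linalg.norm(x, axis=1)[:, np.newaxis]  # divide row i by ||x_i||
    costheta /= np.linalg.norm(y, axis=1)                 # divide column j by ||y_j||
    return np.arccos(costheta)

def pairwise_angles_loop(x, y):
    """Reference implementation: the explicit double loop from the question."""
    return np.array(
        [np.arccos(np.dot(i, j) / (np.linalg.norm(i) * np.linalg.norm(j)))
         for i in x for j in y]
    ).reshape((len(x), len(y)))

# Small random inputs (hypothetical demo data, not normalized on purpose).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(5, 3))
y_demo = rng.normal(size=(4, 3))

theta_fast = pairwise_angles(x_demo, y_demo)
theta_slow = pairwise_angles_loop(x_demo, y_demo)
print(np.allclose(theta_fast, theta_slow))
```

The same comparison can be run against any of the methods above before trusting them on the full-size arrays.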

Timing results for (n, m) = (1212, 252):

>>> %timeit theta = method0(x, y)
100 loops, best of 3: 11.1 ms per loop
>>> %timeit theta = method1(x, y)
100 loops, best of 3: 10.8 ms per loop
>>> %timeit theta = method2(x, y)
100 loops, best of 3: 12.3 ms per loop
>>> %timeit theta = method3(x, y)
100 loops, best of 3: 9.42 ms per loop

The relative difference in timing shrinks as the number of elements increases. For (n, m) = (6252, 1212):

>>> %timeit -n10 theta = method0(x, y)
10 loops, best of 3: 365 ms per loop
>>> %timeit -n10 theta = method1(x, y)
10 loops, best of 3: 358 ms per loop
>>> %timeit -n10 theta = method2(x, y)
10 loops, best of 3: 384 ms per loop
>>> %timeit -n10 theta = method3(x, y)
10 loops, best of 3: 314 ms per loop

However, if you leave out the np.arccos step, i.e., suppose you can manage with just costheta and don't need theta itself, then:

>>> %timeit costheta = np.einsum('ij,kj->ik', x, y)
10 loops, best of 3: 61.3 ms per loop
>>> %timeit costheta = 1 - dist.cdist(x, y, 'cosine')
10 loops, best of 3: 124 ms per loop
>>> %timeit costheta = dist.cdist(x, y, 'cosine')
10 loops, best of 3: 112 ms per loop

This is for the case of (6252, 1212). So np.arccos is actually taking up about 80% of the time. In this case I find that np.einsum is much faster than dist.cdist. So you definitely want to be using einsum.
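The %timeit lines above are IPython magics. Outside IPython, a rough version of the same measurement can be made with the standard-library timeit module. This is only a sketch: the array sizes and the names `x_demo`, `y_demo`, `t_einsum`, and `t_arccos` are assumptions chosen for the example, and the numbers will differ from machine to machine.

```python
import timeit
import numpy as np

# Hypothetical unit-length demo data, smaller than the arrays timed above.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(600, 3))
y_demo = rng.normal(size=(300, 3))
x_demo /= np.linalg.norm(x_demo, axis=1, keepdims=True)
y_demo /= np.linalg.norm(y_demo, axis=1, keepdims=True)

# Precompute the cosines once so the arccos step can be timed on its own.
costheta = np.clip(np.einsum('ij,kj->ik', x_demo, y_demo), -1.0, 1.0)

# Best of 3 repeats, 10 calls each, similar in spirit to %timeit's "best of 3".
t_einsum = min(timeit.repeat(lambda: np.einsum('ij,kj->ik', x_demo, y_demo),
                             number=10, repeat=3))
t_arccos = min(timeit.repeat(lambda: np.arccos(costheta),
                             number=10, repeat=3))

print(f"einsum: {t_einsum / 10 * 1e3:.3f} ms per call")
print(f"arccos: {t_arccos / 10 * 1e3:.3f} ms per call")
```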

Summary: Results for theta are largely similar, but np.einsum is fastest for me, especially when you're not needlessly computing the norms. Try to avoid computing theta, and work with just costheta instead.
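One case where costheta alone is enough is thresholding: since arccos is monotonically decreasing on [-1, 1], testing theta < t for t in [0, pi] is the same as testing costheta > cos(t). The sketch below assumes the inputs are already unit-length; the function name `pairs_within_angle` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def pairs_within_angle(x_unit, y_unit, max_angle):
    """Boolean (n, m) mask of pairs whose angle is below max_angle (radians).

    Assumes the rows of x_unit and y_unit have unit length and that
    0 <= max_angle <= pi. No arccos call is needed.
    """
    costheta = np.einsum('ij,kj->ik', x_unit, y_unit)
    return costheta > np.cos(max_angle)

# Tiny demo: one point compared with a parallel and an orthogonal point.
x_demo = np.array([[1.0, 0.0, 0.0]])
y_demo = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
mask = pairs_within_angle(x_demo, y_demo, np.pi / 4)
print(mask)
```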

Note: An important point I didn't mention is that finite floating-point precision can cause np.arccos to return nan. Methods 0 through 2 normalize inside the function, so they naturally worked for values of x and y that hadn't been properly normalized. But method3 gave a few nans. I fixed this by pre-normalizing, which naturally destroys any gain from using method3, unless you need to do this computation many times for a small set of pre-normalized matrices (for whatever reason).
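The nan values appear when rounding pushes a cosine slightly outside [-1, 1], for example np.arccos(1.0000000000000002) returns nan. Besides pre-normalizing, a common guard is to clamp the cosines with np.clip before calling np.arccos. Note that this clamping is a different technique from the fix described in the note, and `safe_angles` is a hypothetical name used only for this sketch.

```python
import numpy as np

def safe_angles(x, y):
    """Pairwise angles (radians) that stay finite despite rounding error."""
    # Pre-normalize each row to unit length (the fix mentioned in the note).
    x_unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    y_unit = y / np.linalg.norm(y, axis=1, keepdims=True)
    costheta = np.einsum('ij,kj->ik', x_unit, y_unit)
    # Clamp to the valid domain of arccos (an added guard, not from the answer).
    return np.arccos(np.clip(costheta, -1.0, 1.0))

# Comparing an array with itself puts cosines right at the edge of the domain.
a_demo = np.array([[0.1, 0.2, 0.3],
                   [3.0, 4.0, 0.0]])
theta_demo = safe_angles(a_demo, a_demo)
print(np.isnan(theta_demo).any())
```

Clamping hides small rounding errors only; if the inputs are far from unit length, they still need to be normalized first, as the function above does.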

Answer 1: score 3, accepted, 2016-01-12 07:46:16
