Find 2D elements in a 3D array that are similar to 2D elements in another 3D array

Question

I have two 3D arrays and want to identify the 2D elements in one array that have one or more similar counterparts in the other array.

This works in Python 3:

import numpy as np

np.random.seed(123)
A = np.round(np.random.rand(25000,2,2),2)
B = np.round(np.random.rand(25000,2,2),2)

a_index = np.zeros(A.shape[0])

for a in range(A.shape[0]):
    for b in range(B.shape[0]):
        if np.allclose(A[a,:,:].reshape(-1, A.shape[1]), B[b,:,:].reshape(-1, B.shape[1]),
                       rtol=1e-04, atol=1e-06):
            a_index[a] = 1
            break

np.nonzero(a_index)[0]

But of course this approach is awfully slow. Please tell me that there is a more efficient way (and what it is). Thanks.

Answer 1

You are trying to do an all-nearest-neighbors type of query. There are specialized O(n log n) algorithms for this, but I'm not aware of a Python implementation. However, you can use a regular nearest-neighbor search, which is also O(n log n), just a bit slower; for example, scipy.spatial.KDTree or cKDTree.

import numpy as np
import scipy.spatial

np.random.seed(123)
A = np.round(np.random.rand(25000,2,2),2)
B = np.round(np.random.rand(25000,2,2),2)

# Flatten each 2x2 element into a 4-vector and index A
tree = scipy.spatial.cKDTree(A.reshape(25000, 4))
# For every row of B, list the indices of the rows of A within distance r
results = tree.query_ball_point(B.reshape(25000, 4), r=1e-04, p=1)

print([r for r in results if r])
# [[14252], [1972], [7108], [13369], [23171]]

query_ball_point() is not an exact equivalent of allclose(), but it is close enough, especially if you don't care about the rtol parameter of allclose(). You also get a choice of metric (p=1 for city block, or p=2 for Euclidean).
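If only the absolute tolerance matters, one way to get closer to the element-wise test of allclose() is the Chebyshev metric (p=np.inf), where "within r" means that every entry differs by at most r. The sketch below is an illustration on made-up data, not part of the original answer: the array sizes, the planted matches and the choice to index B and query with A (so that the result holds indices into A, as the question asks) are all assumptions. It does not reproduce rtol.

```python
import numpy as np
from scipy.spatial import cKDTree

# Small illustrative data set (sizes and seed are arbitrary assumptions)
rng = np.random.default_rng(0)
A = np.round(rng.random((1000, 2, 2)), 2)
B = np.round(rng.random((1000, 2, 2)), 2)
B[:3] = A[[10, 20, 30]]  # plant three exact matches so the result is non-empty

atol = 1e-6

# Index B, then ask for every row of A which rows of B are close to it
tree = cKDTree(B.reshape(len(B), -1))

# p=np.inf is the max-abs (Chebyshev) norm: a neighbour within r means
# that each of the four entries differs by at most r.
neighbours = tree.query_ball_point(A.reshape(len(A), -1), r=atol, p=np.inf)

# Indices of the elements of A that have at least one counterpart in B
a_idx = np.array([i for i, nb in enumerate(neighbours) if len(nb) > 0])
print(a_idx)
```

With p=1 the sum of the four absolute differences has to stay within r, which is a stricter condition than the per-entry one.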

PS: Consider using query_ball_tree() for very large data sets. Both A and B have to be indexed in that case.
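A minimal sketch of the two-tree variant, assuming the same flattening to 4-vectors as above. The data, the seed and the planted match are made up for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

# Small illustrative data set (sizes and seed are arbitrary assumptions)
rng = np.random.default_rng(1)
A = np.round(rng.random((500, 2, 2)), 2)
B = np.round(rng.random((500, 2, 2)), 2)
B[0] = A[42]  # plant one exact match

# Both arrays get their own index
tree_a = cKDTree(A.reshape(len(A), -1))
tree_b = cKDTree(B.reshape(len(B), -1))

# pairs[i] lists the rows of B that lie within r of row i of A
pairs = tree_a.query_ball_tree(tree_b, r=1e-4, p=1)

# Rows of A with at least one counterpart in B
matched_a = [i for i, nb in enumerate(pairs) if nb]
print(matched_a)
```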

PS: I'm not sure what effect the 2D-ness of the elements should have; the sample code I gave treats them as 1D, and that is identical at least when using the city block metric.

Answer 2

From the docs of np.allclose, we have:

If the following equation is element-wise True, then allclose returns True.

absolute(a - b) <= (atol + rtol * absolute(b))
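To make the criterion concrete, here is a small sketch that evaluates it by hand on one 2x2 pair and compares the result with NumPy's own functions. The values of a and b are made up for illustration. Note that the tolerance is scaled by the second argument, b, so the test is not symmetric in a and b.

```python
import numpy as np

rtol, atol = 1e-4, 1e-6

# Illustrative values: b is a shifted by 5e-5 in every entry
a = np.array([[0.50, 0.25], [0.75, 1.00]])
b = a + 5e-5

# Per-element tolerance, scaled by |b| (the second argument)
tol = atol + rtol * np.abs(b)

# Element-wise outcome of the documented inequality
close = np.abs(a - b) <= tol
print(close)

# allclose is True only if every element passes
print(close.all(), np.allclose(a, b, rtol=rtol, atol=atol))
```

Small entries get a small tolerance, so the same absolute shift can pass for a large entry and fail for a small one.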

Using that criterion, we can write a vectorized implementation using broadcasting, customized for the stated problem, like so:

# Setup parameters
rtol,atol = 1e-04, 1e-06

# Use np.allclose criteria to detect true/false across all pairwise elements
mask = np.abs(A[:, None] - B) <= (atol + rtol * np.abs(B))

# Use the problem context to get final output
out = np.nonzero(mask.all(axis=(2,3)).any(1))[0]
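One caveat with the full broadcast: for the 25000-element arrays from the question, the boolean mask alone has 25000 x 25000 x 2 x 2 entries (about 2.5 GB), and the float64 intermediate is eight times larger. A possible workaround, sketched below, is to process A in blocks so that only one slab of the mask exists at a time. The helper name similar_rows and the chunk size are hypothetical choices for illustration, not part of the original answer.

```python
import numpy as np

def similar_rows(A, B, rtol=1e-4, atol=1e-6, chunk=256):
    """Indices i such that A[i] is allclose to at least one B[j].

    Hypothetical helper: same criterion as the broadcast above,
    evaluated one block of A at a time to bound memory use.
    """
    tol = atol + rtol * np.abs(B)              # per-element tolerance, shape of B
    hit = np.zeros(len(A), dtype=bool)
    for start in range(0, len(A), chunk):
        block = A[start:start + chunk, None]   # (chunk, 1, 2, 2)
        mask = np.abs(block - B) <= tol        # (chunk, len(B), 2, 2)
        # A row of A matches if all entries agree with some row of B
        hit[start:start + chunk] = mask.all(axis=(2, 3)).any(axis=1)
    return np.nonzero(hit)[0]

# Small illustrative data set (sizes and seed are arbitrary assumptions)
rng = np.random.default_rng(2)
A = np.round(rng.random((1000, 2, 2)), 2)
B = np.round(rng.random((1000, 2, 2)), 2)
B[:2] = A[[5, 900]]  # plant two exact matches

out = similar_rows(A, B)
print(out)
```

The chunk size trades memory for the number of Python-level loop iterations; the work is still quadratic in the number of elements.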
