Get indices of intersecting rows of Numpy 2d Array

I want to get the indices of the rows of a main NumPy 2D array A that also appear in another 2D array B.

A=array([[1, 2],
         [3, 4],
         [5, 6],
         [7, 8],
         [9, 10]])

B=array([[1, 4],
         [1, 2],
         [5, 6],
         [6, 3]])

result=[0,2]

This should return [0, 2], i.e. the indices into array A of the rows that also occur in B.

How can this be done efficiently for 2D arrays?

Thank you!

Edit

I have tried this expression:

k[np.in1d(k.view(dtype='i,i').reshape(k.shape[0]), k2.view(dtype='i,i').reshape(k2.shape[0]))]

from "Implementation of numpy in1d for 2D arrays?", but I get a reshape error. My data type is float (with two decimals). I also tried using sets, but the performance is quite slow.

With minimal changes, you can get your approach to work:

In [15]: A
Out[15]: 
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [16]: B
Out[16]: 
array([[1, 4],
       [1, 2],
       [5, 6],
       [6, 3]])

In [17]: np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1))
Out[17]: array([ True, False,  True, False, False], dtype=bool)

In [18]: np.nonzero(np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1)))
Out[18]: (array([0, 2], dtype=int64),)

In [19]: np.nonzero(np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1)))[0]
Out[19]: array([0, 2], dtype=int64)

If your arrays are not floats, and are both contiguous, then the following will be faster:

In [21]: dt = np.dtype((np.void, A.dtype.itemsize * A.shape[1]))

In [22]: np.nonzero(np.in1d(A.view(dt).reshape(-1), B.view(dt).reshape(-1)))[0]
Out[22]: array([0, 2], dtype=int64)

And a quick timing:

In [24]: %timeit np.nonzero(np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1)))[0]
10000 loops, best of 3: 75 µs per loop

In [25]: %timeit np.nonzero(np.in1d(A.view(dt).reshape(-1), B.view(dt).reshape(-1)))[0]
10000 loops, best of 3: 29.8 µs per loop
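
The 'i,i' spec used in the session above describes two 32-bit integers, so it only lines up with the rows of a two-column int32 array. As a minimal sketch, the record dtype can instead be derived from the arrays themselves. The helper names rows_as_records and intersecting_row_indices are made up for illustration and are not part of NumPy, and np.isin is used in place of np.in1d, which newer NumPy releases mark as deprecated:

```python
import numpy as np


def rows_as_records(arr):
    """View each row of a C-contiguous 2D array as one structured element."""
    # One unnamed field per column; NumPy names them f0, f1, ...
    record_dtype = np.dtype([("", arr.dtype)] * arr.shape[1])
    return arr.view(record_dtype).reshape(-1)


def intersecting_row_indices(a, b):
    """Return the indices of the rows of `a` that also occur in `b`."""
    # Assumption: both inputs are 2D with the same number of columns.
    common = np.result_type(a, b)
    # Same dtype and contiguous memory are both required for the view trick.
    a = np.ascontiguousarray(a, dtype=common)
    b = np.ascontiguousarray(b, dtype=common)
    mask = np.isin(rows_as_records(a), rows_as_records(b))
    return np.nonzero(mask)[0]


A = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
B = np.array([[1, 4], [1, 2], [5, 6], [6, 3]])

print(intersecting_row_indices(A, B))  # indices into A: [0 2]
```

The comparison is exact, so float rows only match when every value is exactly equal. Values that differ by rounding noise would need to be rounded or scaled to integers first.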

You can use np.char.array() objects to do this comparison with np.in1d():

s1 = np.char.array(A[:,0]) + '-' + np.char.array(A[:,1])
s2 = np.char.array(B[:,0]) + '-' + np.char.array(B[:,1])

np.where(np.in1d(s1, s2))[0]
#array([0, 2], dtype=int64)

NOTE: A and B must be of the same data type (int, float, etc.) for this to work.
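
For the float data with two decimals mentioned in the question, one option that avoids string conversion is to scale the values to integers and compare rows by broadcasting. This is only a sketch under stated assumptions: the function name matching_rows_broadcast and its decimals parameter are hypothetical, the sample values are invented for illustration, and the intermediate boolean array has shape (len(a), len(b)), so it suits small to medium inputs rather than very large ones.

```python
import numpy as np


def matching_rows_broadcast(a, b, decimals=2):
    """Indices of rows of `a` that occur in `b`, compared at `decimals` places."""
    scale = 10 ** decimals
    # Scale and round to integers so tiny float noise cannot break equality.
    a_int = np.rint(np.asarray(a, dtype=float) * scale).astype(np.int64)
    b_int = np.rint(np.asarray(b, dtype=float) * scale).astype(np.int64)
    # Compare every row of a with every row of b; a pair matches when
    # all of its columns agree.
    equal = (a_int[:, None, :] == b_int[None, :, :]).all(axis=2)
    # Keep the rows of a that match at least one row of b.
    return np.nonzero(equal.any(axis=1))[0]


# Invented sample data; the second row of B carries a little float noise.
A = np.array([[1.10, 2.25], [3.00, 4.50], [5.75, 6.01], [7.00, 8.00]])
B = np.array([[1.10, 4.50], [1.10 + 1e-12, 2.25], [5.75, 6.01]])

print(matching_rows_broadcast(A, B))  # indices into A: [0 2]
```

Because both inputs are converted to the same integer representation first, the same-dtype requirement from the note above is handled inside the function.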
