Numpy find identical element in two arrays

Suppose I have two arrays, a and b. How can I find the identical elements (rows) that appear in both arrays?

a = np.array([[262.5, 262.5, 45],
              [262.5, 262.5, 15],
              [262.5, 187.5, 45],
              [262.5, 187.5, 15],
              [187.5, 262.5, 45],
              [187.5, 262.5, 15],
              [187.5, 187.5, 45],
              [187.5, 187.5, 15]])

b = np.array([[262.5, 262.5, 45],
              [262.5, 262.5, 15],
              [3,3,5],
              [5,5,7],
              [8,8,9]])

I tried the code below, but the output is not what I want. Can anyone tell me what is wrong with this code, or is there another way to do it?

out = [x[(x == b[:,None]).all(1).any(0)] for x in a]

The output I want is:

array([[262.5, 262.5,  45. ],
       [262.5, 262.5,  15. ]])
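
For what it is worth, the attempt above iterates over the rows of a, so x has shape (3,). Comparing it with b[:, None] (shape (5, 1, 3)) and reducing with .all(1).any(0) leaves a mask of shape (3,), which picks individual values out of x rather than whole rows. A minimal sketch of one way to compare whole rows by broadcasting the two arrays against each other; the names pairwise_equal, mask and out are illustrative, and exact floating-point equality is assumed to be acceptable:

```python
import numpy as np

a = np.array([[262.5, 262.5, 45],
              [262.5, 262.5, 15],
              [262.5, 187.5, 45],
              [262.5, 187.5, 15],
              [187.5, 262.5, 45],
              [187.5, 262.5, 15],
              [187.5, 187.5, 45],
              [187.5, 187.5, 15]])

b = np.array([[262.5, 262.5, 45],
              [262.5, 262.5, 15],
              [3, 3, 5],
              [5, 5, 7],
              [8, 8, 9]])

# a[:, None] has shape (8, 1, 3); comparing it with b (5, 3) broadcasts to (8, 5, 3).
pairwise_equal = a[:, None] == b

# all(-1): every column of a row of a equals the same column of a row of b -> (8, 5)
# any(1):  the row of a matches at least one row of b                      -> (8,)
mask = pairwise_equal.all(-1).any(1)

out = a[mask]  # rows of a that also occur as rows of b, in the order they appear in a
print(out)
```

The intermediate array holds len(a) * len(b) * 3 booleans, so this sketch is only reasonable when both arrays are fairly small.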

If you are not tied to using NumPy all the way (which I think is the case, given the list comprehension), you can do a set intersection:

x = set(map(tuple, a)).intersection(set(map(tuple, b)))
print(x)
# {(262.5, 262.5, 15.0), (262.5, 262.5, 45.0)}

You can convert this back to an np.ndarray with:

xarr = np.array(list(x)) 
print(xarr)
# [[262.5 262.5  45. ]
#  [262.5 262.5  15. ]]   (row order may vary, because sets are unordered)
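
Because a set is unordered, the intersection above does not preserve the row order of a. If the order matters, a variation on the same idea is to build the set from b only and use it to mask a. This is a sketch under that assumption; b_rows, mask and ordered are illustrative names, and the small arrays are made up for the example:

```python
import numpy as np

a = np.array([[262.5, 262.5, 45],
              [262.5, 187.5, 45],
              [262.5, 262.5, 15]])
b = np.array([[262.5, 262.5, 15],
              [262.5, 262.5, 45],
              [3, 3, 5]])

# Hashable view of the rows of b, for fast membership tests.
b_rows = set(map(tuple, b))

# One boolean per row of a: True when that exact row also occurs in b.
mask = np.array([tuple(row) in b_rows for row in a], dtype=bool)

ordered = a[mask]  # matching rows, in the order they appear in a
print(ordered)
```

The loop still runs in Python, so this trades speed for simplicity in the same way the set intersection does.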
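
Another option is np.isin. Note that np.isin tests each value against every element of the other array, so this keeps any row whose values all occur somewhere in b, which is not necessarily an exact row match:

a[np.all([np.isin(ai, b) for ai in a], axis=1)]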

Or, selecting from b instead:

b[np.all([np.isin(bi, a) for bi in  b], axis=1)]
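
One thing to keep in mind with the np.isin approach is that membership is tested value by value, not row by row. The sketch below uses made-up data (not the arrays from the question) to show a row that passes the np.isin test without being a row of b, next to a broadcast comparison that requires an exact row match; isin_mask and exact_mask are illustrative names:

```python
import numpy as np

b = np.array([[262.5, 262.5, 45],
              [262.5, 262.5, 15],
              [3, 3, 5]])
a = np.array([[262.5, 262.5, 45],   # an exact row of b
              [45, 15, 3],          # not a row of b, but every value occurs in b
              [187.5, 262.5, 45]])  # 187.5 does not occur anywhere in b

# Element-wise membership, as in the answer above.
isin_mask = np.all([np.isin(ai, b) for ai in a], axis=1)

# Exact row match: compare every row of a with every row of b.
exact_mask = (a[:, None] == b).all(-1).any(1)

print(isin_mask.tolist())   # [True, True, False]
print(exact_mask.tolist())  # [True, False, False]
```

On the arrays from the question both masks agree, so the difference only matters when values can be shared across different rows.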

It is not clear whether you want the first contiguous block or not. Let's assume not, and that you want to retrieve all rows that sit at the same index in both arrays and whose elements are all equal:

import numpy as np

a = np.array(
    [
        [1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4],
        [5, 5, 5],
        [6, 6, 6],
    ]
)

b = np.array(
    [
        [1, 1, 1],
        [2, 2, 2],
        [0, 0, 0],
        [0, 0, 0],
        [5, 5, 5],
    ]
)

expected = np.array(
    [
        [1, 1, 1],
        [2, 2, 2],
        [5, 5, 5],
    ]
)

The first method uses a for loop, which may not be efficient:

out = np.array([x for x, y in zip(a, b) if np.all(x == y)])
assert np.all(out == expected)

The second method is vectorized and therefore much more efficient. You just need to crop the arrays beforehand, because they don't have the same length (zip does that silently):

num_rows = min(a.shape[0], b.shape[0])
a_ = a[:num_rows]
b_ = b[:num_rows]

rows_mask = np.all(a_ == b_, axis=-1)
out = a_[rows_mask, :]

assert np.all(out == expected)
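
The crop-and-compare steps can be wrapped in a small helper. This is only a sketch: rows_equal_at_same_index is a hypothetical name, not a NumPy function. Applied to the arrays from the question, it returns the desired two rows, but only because the matching rows happen to sit at the same indices in a and b:

```python
import numpy as np


def rows_equal_at_same_index(a, b):
    """Return the rows of `a` that equal the row of `b` at the same index."""
    # Crop both arrays to the shorter length, as zip would do implicitly.
    num_rows = min(a.shape[0], b.shape[0])
    a_, b_ = a[:num_rows], b[:num_rows]
    # A row is kept only when all of its columns match.
    return a_[np.all(a_ == b_, axis=-1)]


# Arrays from the question.
a = np.array([[262.5, 262.5, 45],
              [262.5, 262.5, 15],
              [262.5, 187.5, 45],
              [262.5, 187.5, 15],
              [187.5, 262.5, 45],
              [187.5, 262.5, 15],
              [187.5, 187.5, 45],
              [187.5, 187.5, 15]])
b = np.array([[262.5, 262.5, 45],
              [262.5, 262.5, 15],
              [3, 3, 5],
              [5, 5, 7],
              [8, 8, 9]])

result = rows_equal_at_same_index(a, b)
print(result)
```

If the matching rows can appear at different positions in the two arrays, this positional comparison will miss them.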
