Find the indices of a numpy array based on the values in another numpy array

Question

I want to find the indices in a larger array where its values match the values of a different, smaller array. Something like new_array below:

import numpy as np
summed_rows = np.random.randint(low=1, high=14, size=9999)
common_sums = np.array([7,10,13])
new_array = np.where(summed_rows == common_sums)

However, this returns:

__main__:1: DeprecationWarning: elementwise comparison failed; this will raise an error in the future. 
>>>new_array 
(array([], dtype=int64),)

The closest I've gotten is:

new_array = [np.array(np.where(summed_rows == important_sum)) for important_sum in common_sums]

This gives me a list with three numpy arrays (one for each 'important sum'), but each is a different length, which causes further downstream problems with concatenation and vstacking. To be clear, I do not want to use the line above. I want to use numpy to index into summed_rows. I've looked at various answers using numpy.where, numpy.argwhere, and numpy.intersect1d, but am having trouble putting the ideas together. I figured I'm missing something simple and it would be faster to ask.

Thanks in advance for your recommendations!

Answer 1

Taking into account the options proposed in the comments, and adding an extra option using numpy's in1d:

>>> import numpy as np
>>> summed_rows = np.random.randint(low=1, high=14, size=9999)
>>> common_sums = np.array([7,10,13])
>>> ind_1 = (summed_rows==common_sums[:,None]).any(0).nonzero()[0]   # Option of @Brenlla
>>> ind_2 = np.where(summed_rows == common_sums[:, None])[1]   # Option of @Ravi Sharma
>>> ind_3 = np.arange(summed_rows.shape[0])[np.in1d(summed_rows, common_sums)]
>>> ind_4 = np.where(np.in1d(summed_rows, common_sums))[0]
>>> ind_5 = np.where(np.isin(summed_rows, common_sums))[0]   # Option of @jdehesa

>>> np.array_equal(np.sort(ind_1), np.sort(ind_2))
True
>>> np.array_equal(np.sort(ind_1), np.sort(ind_3))
True
>>> np.array_equal(np.sort(ind_1), np.sort(ind_4))
True
>>> np.array_equal(np.sort(ind_1), np.sort(ind_5))
True
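To see why the broadcasting options work, here is a minimal sketch on small hand-made arrays (the values are illustrative, not taken from the question). Adding an axis with common_sums[:, None] makes the comparison broadcast to a 2-D mask with one row per value in common_sums. It also shows why the checks above sort before comparing: np.where on the 2-D mask returns indices grouped by row. As a side note, np.in1d is deprecated in recent NumPy releases in favour of np.isin, so the sketch uses np.isin.

```python
import numpy as np

# Small illustrative arrays (assumed values, chosen so the result is easy to check by hand)
summed_rows = np.array([7, 3, 10, 13, 7, 1])
common_sums = np.array([7, 10, 13])

# common_sums[:, None] has shape (3, 1); comparing it with shape (6,) broadcasts to (3, 6)
mask_2d = summed_rows == common_sums[:, None]
print(mask_2d.shape)  # (3, 6)

# Collapse the common_sums axis: True wherever an element matches any of the values
indices = mask_2d.any(axis=0).nonzero()[0]
print(indices.tolist())  # [0, 2, 3, 4]

# np.where on the 2-D mask walks it row by row, so the column indices come out
# grouped by matched value rather than in ascending order
grouped = np.where(mask_2d)[1]
print(grouped.tolist())  # [0, 4, 2, 3]

# np.isin builds the equivalent 1-D mask directly
via_isin = np.where(np.isin(summed_rows, common_sums))[0]
print(np.array_equal(indices, via_isin))  # True
```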

If you time them, they all land in the same range (roughly 50 to 190 usec per loop), but @Brenlla's option is the fastest one:

python -m timeit -s 'import numpy as np; np.random.seed(0); a = np.random.randint(low=1, high=14, size=9999); b = np.array([7,10,13])' 'ind_1 = (a==b[:,None]).any(0).nonzero()[0]'
10000 loops, best of 3: 52.7 usec per loop

python -m timeit -s 'import numpy as np; np.random.seed(0); a = np.random.randint(low=1, high=14, size=9999); b = np.array([7,10,13])' 'ind_2 = np.where(a == b[:, None])[1]'
10000 loops, best of 3: 191 usec per loop

python -m timeit -s 'import numpy as np; np.random.seed(0); a = np.random.randint(low=1, high=14, size=9999); b = np.array([7,10,13])' 'ind_3 = np.arange(a.shape[0])[np.in1d(a, b)]'
10000 loops, best of 3: 103 usec per loop

python -m timeit -s 'import numpy as np; np.random.seed(0); a = np.random.randint(low=1, high=14, size=9999); b = np.array([7,10,13])' 'ind_4 = np.where(np.in1d(a, b))[0]'
10000 loops, best of 3: 63 usec per loop

python -m timeit -s 'import numpy as np; np.random.seed(0); a = np.random.randint(low=1, high=14, size=9999); b = np.array([7,10,13])' 'ind_5 = np.where(np.isin(a, b))[0]'
10000 loops, best of 3: 67.1 usec per loop
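The numbers above depend on the machine and the NumPy version, so it may be worth re-measuring locally. The following is a rough sketch that reruns three of the options from inside a script using the standard timeit module; the labels in the candidates dictionary and the repeat counts are my own choices, not part of the original benchmark.

```python
import timeit
import numpy as np

np.random.seed(0)
a = np.random.randint(low=1, high=14, size=9999)
b = np.array([7, 10, 13])

# Hypothetical labels for three of the options timed above
candidates = {
    "broadcast + any": lambda: (a == b[:, None]).any(0).nonzero()[0],
    "broadcast + where": lambda: np.where(a == b[:, None])[1],
    "isin + where": lambda: np.where(np.isin(a, b))[0],
}

calls = 200  # calls per repeat; kept small so the script finishes quickly
timings = {}
for name, func in candidates.items():
    # Best of 3 repeats, converted to microseconds per call
    best = min(timeit.repeat(func, number=calls, repeat=3))
    timings[name] = best / calls * 1e6
    print(f"{name}: {timings[name]:.1f} usec per loop")
```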

Answer 2

For anyone looking for the nearest value in the array rather than an exact match, this is a straightforward way to do the same for values that are not exactly equal. For a huge summed_rows, it might be memory intensive.

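    import numpy as np

    summed_rows = np.random.randint(low=1, high=14, size=9999)
    common_sums = np.array([7, 10, 13])

    # Repeat each element once per target value: shape (len(summed_rows), len(common_sums))
    repeat_array = np.repeat(summed_rows, len(common_sums)).reshape(len(summed_rows), len(common_sums))
    # For each value in common_sums, the index of the nearest element in summed_rows
    search_index = np.argmin(np.abs(repeat_array - common_sums), axis=0)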
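As a small worked example of the nearest-value idea, the sketch below uses float data with no exact matches (the arrays values and targets are made up for illustration). It lets broadcasting build the distance table instead of np.repeat, which is an equivalent formulation rather than the answer's exact code; the intermediate array still has one entry per (element, target) pair, so the memory caveat still applies.

```python
import numpy as np

# Illustrative data (assumed): none of the targets appears exactly in values
values = np.array([1.2, 6.8, 9.9, 13.4, 7.1])
targets = np.array([7, 10, 13])

# Shape (5, 1) minus shape (3,) broadcasts to a (5, 3) table of distances
distances = np.abs(values[:, None] - targets)

# For each target (column), the row index of the closest element in values
nearest_index = np.argmin(distances, axis=0)
print(nearest_index.tolist())  # [4, 2, 3]
print(values[nearest_index].tolist())  # [7.1, 9.9, 13.4]
```

Note that np.argmin returns only the first index when several elements are equally close, so this finds one index per target rather than every match.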

Answer 3

Use np.isin:

import numpy as np
summed_rows = np.random.randint(low=1, high=14, size=9999)
common_sums = np.array([7, 10, 13])
new_array = np.where(np.isin(summed_rows, common_sums))
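One detail worth keeping in mind: np.where called with a single argument returns a tuple with one index array per dimension, so new_array above is a one-element tuple. A short sketch on small illustrative arrays (values assumed, not from the question) shows how to unpack it and use the result to index back into summed_rows:

```python
import numpy as np

# Small illustrative arrays (assumed values)
summed_rows = np.array([7, 3, 10, 13, 7, 1])
common_sums = np.array([7, 10, 13])

mask = np.isin(summed_rows, common_sums)  # boolean mask, same shape as summed_rows
result = np.where(mask)                   # tuple holding one index array (1-D input)
indices = result[0]

print(type(result).__name__, len(result))  # tuple 1
print(indices.tolist())                    # [0, 2, 3, 4]
print(summed_rows[indices].tolist())       # [7, 10, 13, 7]
```

If only the matching values are needed and not their positions, indexing with the boolean mask directly (summed_rows[mask]) gives the same elements without the np.where step.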
