简体   繁体   中英

Numpy.where used with list of values

I have a 2d and 1d array. I am looking to find the two rows that contain at least once the values from the 1d array as follows:

import numpy as np

A = np.array([[0, 3, 1],
           [9, 4, 6],
           [2, 7, 3],
           [1, 8, 9],
           [6, 2, 7],
           [4, 8, 0]])

B = np.array([0,1,2,3])

results = []

for elem in B:
    results.append(np.where(A==elem)[0])

This works and results in the following array:

[array([0, 5], dtype=int64),
 array([0, 3], dtype=int64),
 array([2, 4], dtype=int64),
 array([0, 2], dtype=int64)]

But this is probably not the best way of proceeding. Following the answers given in this question ( Search Numpy array with multiple values ) I tried the following solutions:

out1 = np.where(np.in1d(A, B))

num_arr = np.sort(B)
idx = np.searchsorted(B, A)
idx[idx==len(num_arr)] = 0 
out2 = A[A == num_arr[idx]]

But these give me incorrect values:

In [36]: out1
Out[36]: (array([ 0,  1,  2,  6,  8,  9, 13, 17], dtype=int64),)

In [37]: out2
Out[37]: array([0, 3, 1, 2, 3, 1, 2, 0])

Thanks for your help

Since you're dealing with a 2D array * you can use broadcasting to compare B with raveled version of A . This will give you the respective indices in a raveled shape. Then you can reverse the result and get the corresponding indices in original array using np.unravel_index .

In [50]: d = np.where(B[:, None] == A.ravel())[1]

In [51]: np.unravel_index(d, A.shape)
Out[51]: (array([0, 5, 0, 3, 2, 4, 0, 2]), array([0, 2, 2, 0, 0, 1, 1, 2]))                 
                       ^
               # expected result 

* From documentation : For 3-dimensional arrays this is certainly efficient in terms of lines of code, and, for small data sets, it can also be computationally efficient. For large data sets, however, the creation of the large 3-d array may result in sluggish performance. Also, Broadcasting is a powerful tool for writing short and usually intuitive code that does its computations very efficiently in C. However, there are cases when broadcasting uses unnecessarily large amounts of memory for a particular algorithm. In these cases, it is better to write the algorithm's outer loop in Python. This may also produce more readable code, as algorithms that use broadcasting tend to become more difficult to interpret as the number of dimensions in the broadcast increases.

If you need to know whether each row of A contains ANY element of array B without interest in which particular element of B it is, the following script can be used:

input:

np.isin(A,B).sum(axis=1)>0 

output:

array([ True, False,  True,  True,  True,  True])

Is something like this what you are looking for?

import numpy as np from itertools import combinations

A = np.array([[0, 3, 1],
           [9, 4, 6],
           [2, 7, 3],
           [1, 8, 9],
           [6, 2, 7],
           [4, 8, 0]])

B = np.array([0,1,2,3])

for i in combinations(A, 2):
    if np.all(np.isin(B, np.hstack(i))):
        print(i[0], ' ', i[1])

which prints the following:

[0 3 1]   [2 7 3]
[0 3 1]   [6 2 7]

note: this solution does NOT require the rows be consecutive. Please let me know if that is required.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM