
Efficient way to check which array rows fall between the values of two other vectors

I have a large numpy array of shape (20, 702000), and I need to find the rows in which every value falls between the corresponding values of two other arrays. In the following simplified example,

import numpy as np

a = np.array([[1,3,4,3,2,4],
              [4,6,2,5,8,6],
              [1,4,8,3,1,9],
              [9,4,2,5,6,5]])

range_low = np.array([0,2,3,2,1,3])
range_high = np.array([2,4,5,4,3,5])

a[0] is the only row that fits the criteria: each of its values falls strictly between the corresponding values of range_low and range_high.

I have tried something like,

[np.where(np.logical_and(x > range_low, x < range_high)) for x in a]

and it works, but on such a large array the loop takes a very long time. Is there a quicker way to do this? Thanks!

If I understand the question correctly, you want the indices of all rows that fit the criteria? Then I would suggest this:

np.flatnonzero(((a > range_low) & (a < range_high)).all(axis=1))
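To see what the one-liner does, it can be unpacked step by step on the example from the question. This is only an illustrative sketch; the intermediate names mask, row_ok and rows are introduced here and are not part of the original answer.

```python
import numpy as np

a = np.array([[1, 3, 4, 3, 2, 4],
              [4, 6, 2, 5, 8, 6],
              [1, 4, 8, 3, 1, 9],
              [9, 4, 2, 5, 6, 5]])
range_low = np.array([0, 2, 3, 2, 1, 3])
range_high = np.array([2, 4, 5, 4, 3, 5])

# Broadcasting compares every row of a against the bounds,
# giving a boolean mask with the same (4, 6) shape as a.
mask = (a > range_low) & (a < range_high)

# A row qualifies only if all of its columns are inside the bounds.
row_ok = mask.all(axis=1)

# Indices of the qualifying rows.
rows = np.flatnonzero(row_ok)
print(rows)  # [0]
```

Both comparisons are strict, matching the code in the question; use >= and <= instead if the bounds should be inclusive.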

If you have numpy version 1.20.0 or later, you can also use the where keyword of numpy.all.
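The answer does not show how the where keyword would be used, so the following is only one possible reading: restricting the check to a subset of columns. Elements excluded by where are skipped (they count as satisfied), so passing one comparison as where is not equivalent to combining both comparisons with &. The column selection check_cols is hypothetical.

```python
import numpy as np

a = np.array([[1, 3, 4, 3, 2, 4],
              [4, 6, 2, 5, 8, 6],
              [1, 4, 8, 3, 1, 9],
              [9, 4, 2, 5, 6, 5]])
range_low = np.array([0, 2, 3, 2, 1, 3])
range_high = np.array([2, 4, 5, 4, 3, 5])

mask = (a > range_low) & (a < range_high)

# Hypothetical requirement: only columns 0 and 3 must lie inside the bounds.
check_cols = np.array([True, False, False, True, False, False])

# Requires NumPy >= 1.20.0; check_cols is broadcast across the rows,
# and columns where it is False are ignored by the reduction.
row_ok = np.all(mask, axis=1, where=check_cols)
rows = np.flatnonzero(row_ok)
print(rows)  # [0 2]
```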

Quicker, building the mask for the whole array in one vectorized step:

[np.where(x) for x in np.logical_and(a > range_low, a < range_high)]

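Even quicker, but with a different output format (a single tuple of row and column index arrays for all matching elements, rather than one result per row):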

np.where(np.logical_and(a > range_low, a < range_high))
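As a sketch of what that different output looks like on the example from the question, and of one way to get back to the fully matching rows from it. The counting step with np.bincount is an addition for illustration, not part of the original answer, and the names rows, cols, hits_per_row and full_rows are made up here.

```python
import numpy as np

a = np.array([[1, 3, 4, 3, 2, 4],
              [4, 6, 2, 5, 8, 6],
              [1, 4, 8, 3, 1, 9],
              [9, 4, 2, 5, 6, 5]])
range_low = np.array([0, 2, 3, 2, 1, 3])
range_high = np.array([2, 4, 5, 4, 3, 5])

mask = np.logical_and(a > range_low, a < range_high)

# On a 2-D mask, np.where returns (row_indices, column_indices)
# of every element that is inside its bounds.
rows, cols = np.where(mask)
print(rows)  # [0 0 0 0 0 0 2 2]
print(cols)  # [0 1 2 3 4 5 0 3]

# Count the hits per row; a row matches completely when the
# count equals the number of columns.
hits_per_row = np.bincount(rows, minlength=a.shape[0])
full_rows = np.flatnonzero(hits_per_row == a.shape[1])
print(full_rows)  # [0]
```

Row 2 shows up in the raw output because two of its values are inside the bounds, even though the row as a whole does not qualify.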

Times in seconds for 100000 calls each, on the small example above:

import timeit

print(timeit.timeit(lambda: [np.where(np.logical_and(x > range_low, x < range_high)) for x in a], number=100000)) # 1.3141384999999999
print(timeit.timeit(lambda: [np.where(x) for x in np.logical_and(a > range_low, a < range_high)], number=100000)) # 1.0557419000000001
print(timeit.timeit(lambda: np.where(np.logical_and(a > range_low, a < range_high)), number=100000)) # 0.4683885000000001
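The timings above were measured on the small 4x6 example, so they may not reflect how the approaches compare on a large array. A minimal sketch for repeating the comparison on bigger data follows; the array shape, the random data and the function names are assumptions for illustration, and no timings are claimed here.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical test data; adjust the shape to match your own array.
big = rng.integers(0, 10, size=(20_000, 20))
low = np.full(20, -1)
high = rng.integers(5, 12, size=20)


def loop_version():
    # One Python-level iteration per row, as in the question.
    return [i for i, x in enumerate(big)
            if np.logical_and(x > low, x < high).all()]


def vectorized_version():
    # One broadcasted comparison over the whole array.
    return np.flatnonzero(((big > low) & (big < high)).all(axis=1))


# Both approaches should select the same rows.
assert loop_version() == vectorized_version().tolist()

for fn in (loop_version, vectorized_version):
    print(fn.__name__, timeit.timeit(fn, number=3))
```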


 