Indexing large 2d numpy array by conditions

Question

I have a 2D NumPy array of shape (1500, 3712). I want to find the indices of the elements whose values lie between -40 and -10. So far I have:

for item in lon_array:
    for value in item:
        if value >= -40 and value <= -10:
            find_index = np.where(lon_array == value)
            index = np.asarray(find_index).T

Since it is a really big array, is there any way to make this faster?

Answer 1

Assuming lon_array is a NumPy array, you can use the following:

find_index = np.where(np.logical_and(lon_array >= -40, lon_array <= -10))
index = np.asarray(find_index).T

np.where accepts a single condition array, so the two comparisons are combined with np.logical_and to express the between condition.

It can be done as a one-liner too:

>>> lon_arr
array([[ 20, -40],
       [ 30, -30],
       [ 20, -14],
       [ 30, -30]])
>>> np.asarray(np.where(np.logical_and(lon_arr>=-40, lon_arr<=-10))).T
array([[0, 1],
       [1, 1],
       [2, 1],
       [3, 1]])
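
As a possible variant of the one-liner above, the two comparisons can also be combined with the & operator, and np.argwhere returns the (row, column) pairs directly. This is only a sketch on the same small example array; the names mask and index are chosen here for illustration.

```python
import numpy as np

# Same small example array as in the session above.
lon_arr = np.array([[20, -40],
                    [30, -30],
                    [20, -14],
                    [30, -30]])

# Parentheses are required: & binds more tightly than >= and <=.
mask = (lon_arr >= -40) & (lon_arr <= -10)

# np.argwhere returns one (row, column) pair per True entry of the mask.
index = np.argwhere(mask)
print(index)
```

The result should match np.asarray(np.where(mask)).T, so which spelling to use is mostly a matter of taste.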

Answer 2

If lon_array is a list of lists (plain built-in Python lists rather than a NumPy array), enumerate(...) is a straightforward way to get the indices of each element:

for y, row in enumerate(lon_array):
    for x, value in enumerate(row):
        if -40 <= value <= -10:
            index = (y, x)
            # do something useful with it...
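
To collect every match instead of handling them one at a time, the same loop can be written as a list comprehension. The sketch below uses a small made-up nested list (lon_list is a hypothetical stand-in for the real data) and also shows that converting the list once with np.asarray gives the same pairs through NumPy.

```python
import numpy as np

# Hypothetical nested list standing in for the real data.
lon_list = [[-70.0, -35.0, 5.0],
            [-10.0, 12.0, -40.0]]

# Pure-Python version: collect every matching (row, column) pair.
pairs = [(y, x)
         for y, row in enumerate(lon_list)
         for x, value in enumerate(row)
         if -40 <= value <= -10]

# NumPy version: convert once, then let NumPy do the comparisons.
arr = np.asarray(lon_list)
pairs_np = np.argwhere((arr >= -40) & (arr <= -10))

print(pairs)
print(pairs_np)
```

For an array as large as the one in the question, the conversion to a NumPy array is likely the faster route, since the comparisons then run in compiled code.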

Answer 3

With numpy.where, you can extract the indices based on the values of your data, and the iteration over the data runs in the optimized C code on which NumPy is built:

import numpy as np

x = np.random.random(10).reshape(2, 5)
print(x)
indices = np.where(x < 0.2)  #<--- this selects the indices based on a filter
print(indices)
x[indices]

output:

[[ 0.11129659  0.33608351  0.07542966  0.44118394  0.14848829]
 [ 0.8475123   0.27994122  0.91797756  0.02662857  0.52820238]]

# These are the indices produced by np.where:
# the first array contains the rows `i` and the second the columns `j`
(array([0, 0, 0, 1]), array([0, 2, 4, 3]))

array([ 0.11129659,  0.07542966,  0.14848829,  0.02662857])

Because your filter has two conditions, I suggest combining them into a single boolean array before passing it to np.where. The line below uses np.logical_or on the example data; for a between condition such as yours, np.logical_and is the one to use:

indices = np.where(np.logical_or(x < 0.2, x > 0.8))
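
Applied to the range from the question, the construct might look like the sketch below. The data is randomly generated stand-in data (the seed, shape, and value range are assumptions made for illustration), and the explicit double loop is there only to check that the vectorized result agrees with it.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
lon_array = rng.uniform(-180, 180, size=(50, 60))  # hypothetical stand-in data

# Vectorized: both comparisons and the AND run in compiled code.
rows, cols = np.where(np.logical_and(lon_array >= -40, lon_array <= -10))

# Reference: explicit double loop in row-major order.
expected = [(i, j)
            for i in range(lon_array.shape[0])
            for j in range(lon_array.shape[1])
            if -40 <= lon_array[i, j] <= -10]

# np.where also reports matches in row-major order, so the lists agree.
assert list(zip(rows.tolist(), cols.tolist())) == expected
```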

Answer 4

For completeness, you could use numpy.nonzero as an alternative to the numpy.where based solution that was proposed in other answers:

In [1245]: lon_array = np.linspace(-70, 20, 10).reshape(2, 5)

In [1246]: lon_array
Out[1246]: 
array([[-70., -60., -50., -40., -30.],
       [-20., -10.,   0.,  10.,  20.]])

In [1247]: idx = np.nonzero(np.logical_and(lon_array >= -40, lon_array <= -10))

In [1248]: idx
Out[1248]: (array([0, 0, 1, 1], dtype=int64), array([3, 4, 0, 1], dtype=int64))

In [1249]: lon_array[idx]
Out[1249]: array([-40., -30., -20., -10.])

The demo above shows that NumPy's nonzero returns a tuple of 1D arrays, one per dimension, which triggers advanced indexing when used as a selection object.
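
As a small follow-up sketch using the same linspace data, the tuple from np.nonzero can be turned into (row, column) pairs with np.transpose, and it can also be used on the left-hand side of an assignment. The names pairs, selected, and masked are chosen here for illustration.

```python
import numpy as np

lon_array = np.linspace(-70, 20, 10).reshape(2, 5)

# Tuple of two 1D arrays: row indices and column indices.
idx = np.nonzero((lon_array >= -40) & (lon_array <= -10))

# One (row, column) pair per match.
pairs = np.transpose(idx)

# Advanced indexing: read the matching values ...
selected = lon_array[idx]

# ... or overwrite them in a copy, leaving the original untouched.
masked = lon_array.copy()
masked[idx] = np.nan

print(pairs)
print(selected)
```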

4 answers

solution1
3 2016-04-26 13:57:24

solution2
1 2016-04-26 13:49:46

solution3
0 2016-04-26 13:55:01

solution4
0 2016-04-27 15:41:42
