I have a situation in which I have an ndarray X of floats, let's say 100x10, and I want to look at some conditions on the first column and create a boolean ndarray B of shape 100x1. Then I want to use B as an index into X to pull out values where a True is located. However, for each True in B I want to pull out the entire row of X. I thought this would work automatically, as B would be broadcast to a 100x10 shape. However, it doesn't seem to work this way. Here's an example using 2x2 and 2x1 ndarrays.
import numpy as np
a = np.array([True, False])
a.shape = (2,1)
b = np.array([1, 2, 3, 4])
b.shape = (2,2)
print(a)
print(b)
print(b[a])
This prints
[[ True]
 [False]]
[[1 2]
 [3 4]]
[1]
I expected it to print [1 2]. Why doesn't the broadcasting work the way I expect?
The rules for so-called "fancy indexing" are detailed in the NumPy indexing documentation. In particular, when the index, obj, is a NumPy array of dtype bool, x[obj]
... is always equivalent to (but faster than) x[obj.nonzero()] where, as described above, obj.nonzero() returns a tuple (of length obj.ndim) of integer index arrays showing the True elements of obj.
Since,
In [4]: a.nonzero()
Out[4]: (array([0]), array([0]))
b[a] is equivalent to b[a.nonzero()], which is
In [6]: b[(np.array([0]), np.array([0]))]
Out[6]: array([1])
In [7]: b[a]
Out[7]: array([1])
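As a minimal sketch of the nonzero() equivalence quoted above, the snippet below compares both forms using a mask with the same shape as b. The name mask and its values are made up for illustration; a and b are the arrays from the question. Only a.nonzero() is applied to the 2x1 array, since how NumPy treats a boolean index whose shape does not match the indexed array may depend on the version.

```python
import numpy as np

b = np.array([[1, 2], [3, 4]])

# Hypothetical mask with the same shape as b (not from the original post).
mask = np.array([[True, False], [False, True]])

# nonzero() returns one integer index array per dimension of the mask.
rows, cols = mask.nonzero()

via_mask = b[mask]           # boolean indexing
via_nonzero = b[rows, cols]  # the equivalent integer indexing

# The 2x1 array from the question has a single True, at position (0, 0),
# so integer indexing with its nonzero() result selects b[0, 0] only.
a = np.array([[True], [False]])
a_idx = a.nonzero()
picked = b[a_idx]

print(via_mask, via_nonzero, a_idx, picked)
```

In other words, each True contributes exactly one (row, column) pair, which is why a 2x1 mask with one True picks out a single element rather than a whole row.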
If you want to use a boolean array a to select rows of b, then, as Joran Beasley states, just keep a as a 1-dimensional boolean array:
import numpy as np
a = np.array([True, False])
b = np.array([1, 2, 3, 4])
b.shape = (2,2)
print(b[a])
# [[1 2]]
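Applied to the 100x10 case from the question, the same idea might look like the sketch below. The random data, the seed, and the 0.5 threshold are assumptions chosen only for illustration. If the mask already exists as a 100x1 column, taking its first column (or calling ravel()) gives the 1-dimensional form needed for row selection. The last part shows what explicit broadcasting would give: a flat array of the selected values rather than rows.

```python
import numpy as np

# Illustrative data only: shape from the question, values and seed are assumptions.
rng = np.random.default_rng(0)
X = rng.random((100, 10))

# A 100x1 column mask built from a condition on the first column
# (the 0.5 threshold is a made-up example condition).
B = (X[:, 0] > 0.5).reshape(-1, 1)

# Flatten the column mask to 1-D so that each True selects a whole row.
rows = X[B[:, 0]]            # B.ravel() would work as well

# Or skip the 2-D mask and index with the 1-D condition directly.
direct = X[X[:, 0] > 0.5]

# Explicitly broadcasting the mask to X's shape selects the same values,
# but returns them as a flat 1-D array, so the row structure is lost.
flat = X[np.broadcast_to(B, X.shape)]

print(rows.shape, direct.shape, flat.shape)
```

Keeping the mask 1-dimensional is the simpler option here, because the result keeps its 2-D row layout without any reshaping.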