I have a NumPy array like the one below. I need a count of the rows whose first element is 2. In the array below, four rows start with 2, so the answer would be 4. How is this best accomplished in NumPy? (I cannot use pandas, but I can use SciPy.)
array([[1, 4, 5],
       [1, 4, 5],
       [2, 4, 5],
       [2, 4, 5],
       [2, 4, 5],
       [2, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5]])
First, take the first column, all rows:
a[:,0]
Then, find the 2s:
a[:,0] == 2
That gives you a boolean array, which you can then sum (each True counts as 1):
(a[:,0] == 2).sum()
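Put together, the three steps above can be sketched as a self-contained script. The array is the one from the question; the names first_col, mask and count are only illustrative.

```python
import numpy as np

# The array from the question
a = np.array([[1, 4, 5],
              [1, 4, 5],
              [2, 4, 5],
              [2, 4, 5],
              [2, 4, 5],
              [2, 4, 5],
              [3, 4, 5],
              [3, 4, 5],
              [3, 4, 5],
              [3, 4, 5],
              [3, 4, 5],
              [3, 4, 5]])

first_col = a[:, 0]    # first column, all rows
mask = first_col == 2  # boolean array, True where the row starts with 2
count = mask.sum()     # True counts as 1, False as 0
print(count)           # 4
```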
There is np.count_nonzero, which in a common idiom is applied to boolean arrays generated by evaluating a condition:
np.count_nonzero(data[:, 0] == 2)
By the way, it's probably just for the sake of the example, but if your array is sorted by its first column, as yours is, you can also use np.searchsorted. The (2, 3) bounds below rely on the values being integers:
np.diff(np.searchsorted(data[:, 0], (2, 3)))[0]
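If the first column may hold non-integer values, the same idea can be written with the side argument of np.searchsorted instead of searching for the next integer. This is a minimal sketch, assuming the first column is sorted in ascending order; the names col, lo and hi are illustrative.

```python
import numpy as np

# The array from the question; its first column is sorted ascending
data = np.array([[1, 4, 5]] * 2 + [[2, 4, 5]] * 4 + [[3, 4, 5]] * 6)

col = data[:, 0]
lo = np.searchsorted(col, 2, side="left")   # index of the first 2
hi = np.searchsorted(col, 2, side="right")  # index just past the last 2
count = int(hi - lo)
print(count)  # 4
```

Because searchsorted does a binary search, this avoids scanning the whole column, but it gives wrong results if the column is not sorted.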
Here is one more approach, in addition to the ones above:
>>> x[:,0]==2
array([False, False, True, True, True, True, False, False, False,
False, False, False], dtype=bool)
gives you True for the rows whose first column is 2.
>>> x[x[:,0]==2]
array([[2, 4, 5],
[2, 4, 5],
[2, 4, 5],
[2, 4, 5]])
gives you the corresponding rows, i.e. those that satisfy the condition. Now you can use the shape attribute to get the number of rows:
x[x[:,0]==2].shape[0]
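As a quick sanity check, the counting approaches from this thread can be compared on the question's array. This is only a sketch; the variable names are illustrative.

```python
import numpy as np

# The array from the question
x = np.array([[1, 4, 5]] * 2 + [[2, 4, 5]] * 4 + [[3, 4, 5]] * 6)

mask = x[:, 0] == 2

by_sum = int(mask.sum())                        # sum of the boolean mask
by_count_nonzero = int(np.count_nonzero(mask))  # count of True entries
by_shape = x[mask].shape[0]                     # rows kept by boolean indexing

print(by_sum, by_count_nonzero, by_shape)  # 4 4 4
```

Note that x[mask] builds a copy of the matching rows before counting them, so the sum and count_nonzero variants do less work when only the count is needed.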