I need to look through my data set and find all values that meet certain conditions. I have tried pandas.where(cond), which seems to accept just one condition.
For example, consider the following data set:
a b c d
1 2 3 899
4 5 -344 21
7 8 9 10
I need this result, keeping only the values where 0 < data.values and data.values < 30:
a b c d
1 2 3 NaN
4 5 NaN 21
7 8 9 10
Most of the scripts I have found return only the rows or columns that meet the conditions. However, I need to keep the rest of the values in each row and column. For example, I do not want to lose 2 and 3 in the first row, or 4 and 5 in the second row.
Create a boolean DataFrame and apply boolean indexing, or use where with 'inverted conditions' (< to >= and > to <=):
m = (df >= 0) & (df <= 30)
print(m)
      a     b      c      d
0  True  True   True  False
1  True  True  False   True
2  True  True   True   True
df = df[m]
# alternatively:
# df = df.where(m)
print(df)
   a  b    c     d
0  1  2  3.0   NaN
1  4  5  NaN  21.0
2  7  8  9.0  10.0
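For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question and uses the strict bounds written there (0 < value < 30) rather than the inclusive ones above; the variable name result is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "a": [1, 4, 7],
    "b": [2, 5, 8],
    "c": [3, -344, 9],
    "d": [899, 21, 10],
})

# Element-wise condition with strict bounds: 0 < value < 30.
m = (df > 0) & (df < 30)

# Values where m is False become NaN; every other value stays in place.
result = df.where(m)
print(result)
```

Columns that receive a NaN are upcast to float, which is why 3 is printed as 3.0.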
NumPy solution:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.where(m, df, np.nan), index=df.index, columns=df.columns)
print(df)
     a    b    c     d
0  1.0  2.0  3.0   NaN
1  4.0  5.0  NaN  21.0
2  7.0  8.0  9.0  10.0
Or use mask, which replaces the values where the condition is True:
m = (df < 0) | (df > 30)
df = df.mask(m)
print(df)
   a  b    c     d
0  1  2  3.0   NaN
1  4  5  NaN  21.0
2  7  8  9.0  10.0
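Since mask is the mirror image of where, the 'remove' condition does not have to be written out by hand; negating the 'keep' condition with ~ should give the same frame. A small sketch, again assuming the question's sample data (the names keep, via_where and via_mask are made up for illustration):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "a": [1, 4, 7],
    "b": [2, 5, 8],
    "c": [3, -344, 9],
    "d": [899, 21, 10],
})

# Condition describing the values to keep.
keep = (df >= 0) & (df <= 30)

# where() keeps values where the condition is True ...
via_where = df.where(keep)

# ... while mask() blanks values where the condition is True,
# so passing the negated condition targets the same cells.
via_mask = df.mask(~keep)

# DataFrame.equals treats NaNs in the same positions as equal.
print(via_where.equals(via_mask))
```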
This can be accomplished with a boolean expression (which can be compound) as the selection criterion. pandas overloads the dunder (double underscore) method for subscripting, __getitem__, to accept such an expression. A common pitfall is that the expression is evaluated element by element and is not a plain logical expression, so you need to use the bitwise operators & and | rather than and and or when it is compound. These operators bind more tightly than the equality and comparison operators (e.g. ==, >, >=), so you need to put each comparison inside parentheses.
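To make the two pitfalls concrete, here is a rough sketch on a cut-down version of the question's data. The variable names (ok, and_error, paren_error) are just for illustration, and the exact exception message may differ between pandas versions.

```python
import pandas as pd

# Two columns from the question's data are enough to show the behaviour.
df = pd.DataFrame({"a": [1, 4, 7], "d": [899, 21, 10]})

# Parenthesised comparisons combined with & give an element-wise boolean frame.
ok = (df > 0) & (df < 30)

# Python's `and` needs a single truth value for the whole DataFrame,
# which pandas refuses to provide.
try:
    (df > 0) and (df < 30)
    and_error = None
except ValueError as exc:
    and_error = exc

# Without parentheses, & binds first, so this is parsed as the chained
# comparison df > (0 & df) < 30, which also needs a single truth value.
try:
    df > 0 & df < 30
    paren_error = None
except (ValueError, TypeError) as exc:
    paren_error = exc

print(ok)
print(type(and_error).__name__, type(paren_error).__name__)
```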
I believe the answer given by @jezrael will work. This is just an explanation of what s/he has provided.