I'm querying a pandas DataFrame df like this:
df = df[
(df.value1 >= threshold1) &
(df.value2 >= threshold2) &
(df.value3.isin(list3))
]
Python has the built-in function all, which allows this syntax:
if all([
value1 > threshold1,
value2 > threshold2,
value3 in list3,
]):
Instead of this:
if (
value1 > threshold1 and
value2 > threshold2 and
value3 in list3
):
Does pandas have something similar to Python's all? Thanks.
Also, is this the fastest way of subsetting a Pandas dataframe based on multiple conditions?
@juanpa.arrivillaga already gave you a very good explanation about boolean indexing in Pandas.
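I'd like to offer a slightly nicer alternative, the DataFrame.query() method. Inside the query string, a name prefixed with @ refers to a Python variable from the surrounding scope, while bare names refer to columns: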
df.query("value1 > @threshold1 and value2 > @threshold2 and value3 in @list3")
Demo:
In [138]: df = pd.DataFrame(np.random.randint(1, 10, (10, 3)),
columns=['value1','value2','value3'])
In [139]: df
Out[139]:
value1 value2 value3
0 7 9 1
1 4 1 3
2 3 8 8
3 2 8 9
4 9 2 7
5 5 8 9
6 4 2 9
7 7 2 5
8 6 3 5
9 9 1 5
In [140]: threshold1 = 2
In [141]: threshold2 = 4
In [142]: list3 = [1,9]
In [143]: df.query("value1 > @threshold1 and value2 > @threshold2 and value3 in @list3")
Out[143]:
value1 value2 value3
0 7 9 1
5 5 8 9
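To answer the "something like all" part more directly: one option is to collect the boolean Series in a list and reduce them with an element-wise AND. The sketch below is only an illustration, not a benchmark. It re-enters the demo data by hand (the original was random) and uses the strict > from the demo rather than the >= in the question. The names conditions, mask and result are my own.

```python
import numpy as np
import pandas as pd

# Demo data from above, typed in by hand so the result is reproducible.
df = pd.DataFrame({
    "value1": [7, 4, 3, 2, 9, 5, 4, 7, 6, 9],
    "value2": [9, 1, 8, 8, 2, 8, 2, 2, 3, 1],
    "value3": [1, 3, 8, 9, 7, 9, 9, 5, 5, 5],
})
threshold1, threshold2, list3 = 2, 4, [1, 9]

# One boolean Series per condition, listed much like the arguments to all([...]).
conditions = [
    df["value1"] > threshold1,
    df["value2"] > threshold2,
    df["value3"].isin(list3),
]

# Element-wise AND across every condition; yields a boolean NumPy array.
mask = np.logical_and.reduce(conditions)
result = df[mask]

# A pandas-only variant: put the conditions side by side and require all per row.
mask_pd = pd.concat(conditions, axis=1).all(axis=1)
result_pd = df[mask_pd]

print(result)
```

Both variants select the same rows as the chained & version, so the choice is mostly about readability when the list of conditions grows or is built dynamically. Whether either is faster than query() depends on the size of the frame, so it is worth timing on your own data.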