
Pandas dataframe check if a value exists in multiple columns for one row

I want to print the index of every row where more than one column is True.

For example, given the following DataFrame:

   Remove  Ignore  Repair
0    True   False   False
1   False    True    True
2   False    True   False

I want it to print:

1

Is there an elegant way to do this instead of a bunch of if statements?

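  • You can use sum with axis=1 to add up the values across the columns of each row.
import pandas as pd

# Three boolean columns; only row 1 has more than one True
df = pd.DataFrame({'a': [False, True, False], 'b': [False, True, False], 'c': [True, False, False]})
print(df)
# Sum across the columns (axis=1) and keep the index of rows with more than one True
print("Ans:", df[df.sum(axis=1) > 1].index.tolist())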

Output:

       a      b      c
0  False  False   True
1   True   True  False
2  False  False  False
Ans: [1]
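
For reference, here is the same approach applied to the DataFrame from the question. This is a minimal sketch; the variable names true_counts and matching_rows are my own and not part of the original answer.

```python
import pandas as pd

# The DataFrame from the question
df = pd.DataFrame(
    {
        "Remove": [True, False, False],
        "Ignore": [False, True, True],
        "Repair": [False, True, False],
    }
)

# Count the True values in each row (True counts as 1, False as 0)
true_counts = df.sum(axis=1)

# Keep the index labels of rows with more than one True
matching_rows = df.index[true_counts > 1].tolist()
print(matching_rows)
```

Only row 1 has two True values (Ignore and Repair), so it is the only index label in the list.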

You can sum the booleans directly, because True is treated as 1 and False as 0:

df.sum(axis=1) > 1

To keep only the rows where this evaluates to True:

df.loc[df.sum(axis=1) > 1]

Or the same thing but being more explicit about converting the booleans to integers:

df.loc[df.astype(int).sum(axis=1) > 1]
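
If the DataFrame also has non-boolean columns, summing every column would no longer count True values. One possible way around that, sketched below, is to restrict the sum to the boolean columns with select_dtypes. The "name" column and its values are hypothetical, added only to illustrate the case.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pump", "valve", "motor"],  # hypothetical non-boolean column
        "Remove": [True, False, False],
        "Ignore": [False, True, True],
        "Repair": [False, True, False],
    }
)

# Select only the boolean columns so the text column is left out of the sum
flags = df.select_dtypes(include="bool")

# Filter the full frame using the per-row count of True flags
result = df.loc[flags.sum(axis=1) > 1]
print(result)
```

The filter is computed from the flag columns only, but result still carries every column of the original frame.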

To get the index label of the first row that meets the criteria:

df.index[df.sum(axis=1).gt(1)][0]

Output:

Out[14]: 1

Since more than one row can match, you can drop the [0] to get all the rows that meet your criteria.
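
Note that indexing with [0] raises an IndexError when no row matches. A small sketch of handling both cases follows; the sample frames and the names matches, first_row and first_or_none are my own, chosen for illustration.

```python
import pandas as pd

# Sample frame in which two rows have more than one True
df = pd.DataFrame(
    {
        "a": [True, True, False],
        "b": [True, True, False],
        "c": [False, True, False],
    }
)

# All index labels whose row contains more than one True
matches = df.index[df.sum(axis=1).gt(1)]
all_rows = matches.tolist()

# Take the first match only if there is one, to avoid an IndexError
first_row = matches[0] if len(matches) > 0 else None

# A frame with no qualifying rows yields an empty index
none_df = pd.DataFrame({"a": [True, False], "b": [False, False]})
no_matches = none_df.index[none_df.sum(axis=1).gt(1)]
first_or_none = no_matches[0] if len(no_matches) > 0 else None

print(all_rows, first_row, first_or_none)
```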
