Select rows from a DataFrame based on True or False in a column in pandas

I want to select rows from a DataFrame based on the TRUE/FALSE values in a column. For example:

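import pandas as pd

# Flags are stored as the strings "TRUE"/"FALSE", not as booleans.
data = {'uid': ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
        'type': ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
        'is_topup': ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"],
        'label': ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"]}
df = pd.DataFrame(data)
print(df)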



   uid type  is_topup  label
0   1    a    FALSE  FALSE
1   1    a    FALSE  FALSE
2   1    b     TRUE   TRUE
3   1    a    FALSE  FALSE
4   2    a    FALSE  FALSE
5   2    b     TRUE   TRUE
6   2    b     TRUE   TRUE
7   3    a    FALSE  FALSE
8   3    b     TRUE   TRUE
9   3    b     TRUE   TRUE
10  3    a    FALSE  FALSE

For each uid, I want to keep the rows up to and including the first row where is_topup is TRUE, like this:

   uid type  is_topup  label
0   1    a    FALSE  FALSE
1   1    a    FALSE  FALSE
2   1    b     TRUE   TRUE
4   2    a    FALSE  FALSE
5   2    b     TRUE   TRUE
7   3    a    FALSE  FALSE
8   3    b     TRUE   TRUE

I looked through the pandas documentation but did not find an answer.

I'm not sure this is the most efficient way, but you can use idxmax to find the first TRUE row within each uid and slice up to it:

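# The question's columns hold the strings "TRUE"/"FALSE"; convert them to booleans first.
df[['is_topup', 'label']] = df[['is_topup', 'label']] == 'TRUE'
new_df = df.groupby('uid').apply(lambda x: x[:(x['is_topup'] & x['label']).reset_index(drop=True).idxmax()+1])
print(new_df)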

Output:

       uid type  is_topup  label
uid                             
1   0    1    a     False  False
    1    1    a     False  False
    2    1    b      True   True
2   4    2    a     False  False
    5    2    b      True   True
3   7    3    a     False  False
    8    3    b      True   True
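
If the per-group apply turns out to be slow, the same selection can probably be done without it. The following is only a sketch, assuming the 11-row table printed in the question and its string flags; the names hit and hits_before are made up for illustration. It counts how many TRUE rows came earlier within the same uid and keeps the rows where that count is zero.

```python
import pandas as pd

# Data copied from the table printed in the question (11 rows, string flags).
flags = ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE",
         "FALSE", "TRUE", "TRUE", "FALSE"]
df = pd.DataFrame({
    "uid": ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
    "type": ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
    "is_topup": flags,
    "label": flags,
})

# True where both flags are the string "TRUE".
hit = df["is_topup"].eq("TRUE") & df["label"].eq("TRUE")

# Number of hits seen in earlier rows of the same uid
# (running count including the current row, minus the current row itself).
hits_before = hit.astype(int).groupby(df["uid"]).cumsum() - hit.astype(int)

# Keep every row that has no earlier hit within its uid.
result = df[hits_before.eq(0)]
print(result)
```

One difference to be aware of: for a uid with no TRUE row at all, this version keeps every row of that uid, whereas the idxmax version keeps only the first row, because idxmax returns the first position when nothing is True.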

It seems to me that a simple

result = df.drop_duplicates()

should do the trick. At least your given example would work that way.
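
One caveat worth checking before relying on this: drop_duplicates compares whole rows, not positions within a uid. A quick sketch, assuming the 11-row table printed in the question (expected_index is just the list of row labels from the desired output), suggests it comes close but also removes row 1, which is identical to row 0.

```python
import pandas as pd

# Data copied from the table printed in the question (11 rows, string flags).
flags = ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE",
         "FALSE", "TRUE", "TRUE", "FALSE"]
df = pd.DataFrame({
    "uid": ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
    "type": ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
    "is_topup": flags,
    "label": flags,
})

# Row labels shown in the question's desired output.
expected_index = [0, 1, 2, 4, 5, 7, 8]

# drop_duplicates keeps the first occurrence of each distinct row.
deduped = df.drop_duplicates()

print(deduped.index.tolist())                    # rows that survive
print(deduped.index.tolist() == expected_index)  # does it match the desired output?
```

So it only matches the desired output when no row before the first TRUE row repeats within a uid.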
