Select rows from a DataFrame based on True or False in a column in pandas:
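For example, given this DataFrame:
import pandas as pd
# Data matches the 11-row table printed below (uid 1, 2 and 3).
data = {'uid': ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
        'type': ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
        'is_topup': ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"],
        'label': ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"]}
df = pd.DataFrame(data)
print(df)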
   uid type is_topup  label
0    1    a    FALSE  FALSE
1    1    a    FALSE  FALSE
2    1    b     TRUE   TRUE
3    1    a    FALSE  FALSE
4    2    a    FALSE  FALSE
5    2    b     TRUE   TRUE
6    2    b     TRUE   TRUE
7    3    a    FALSE  FALSE
8    3    b     TRUE   TRUE
9    3    b     TRUE   TRUE
10   3    a    FALSE  FALSE
For each uid, I want to select the rows up to and including the first row where is_topup and label are TRUE, like this:
   uid type is_topup  label
0    1    a    FALSE  FALSE
1    1    a    FALSE  FALSE
2    1    b     TRUE   TRUE
4    2    a    FALSE  FALSE
5    2    b     TRUE   TRUE
7    3    a    FALSE  FALSE
8    3    b     TRUE   TRUE
I tried to look at pandas documentation but did not find the answer.
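Not sure this is the most efficient way, but you can use idxmax. This assumes is_topup and label hold real booleans (as in the output below) rather than the strings "TRUE"/"FALSE":
# Within each uid, slice from the start up to and including the first row where both flags are True.
new_df = df.groupby('uid').apply(lambda x: x[:(x['is_topup'] & x['label']).reset_index(drop=True).idxmax()+1])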
print(new_df)
Output:
      uid type  is_topup  label
uid
1   0   1    a     False  False
    1   1    a     False  False
    2   1    b      True   True
2   4   2    a     False  False
    5   2    b      True   True
3   7   3    a     False  False
    8   3    b      True   True
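If the columns really contain the strings "TRUE" and "FALSE", as in the question's constructor, a vectorized variant without apply may also work. This is only a sketch, assuming the 11-row table from the question; the names hit and hits_before are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Assumed input: the 11-row table from the question, flags stored as strings.
flags = ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE",
         "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"]
df = pd.DataFrame({
    'uid': ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
    'type': ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
    'is_topup': flags,
    'label': flags,
})

# Turn the string flags into one boolean mask: True where both columns say "TRUE".
hit = (df['is_topup'].eq("TRUE") & df['label'].eq("TRUE")).astype(int)

# Count the hits seen strictly before each row within its uid group.
hits_before = hit.groupby(df['uid']).cumsum() - hit

# Keep rows that have no earlier hit in their group, i.e. everything
# up to and including the first hit.
new_df = df[hits_before == 0]
print(new_df.index.tolist())
```

One difference in behavior: a uid with no TRUE row keeps all of its rows here, whereas the idxmax slice would keep only the first row, because idxmax returns position 0 when every value is False.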
It seems to me that a simple
result = df.drop_duplicates()
should do the trick. At least your given example would work that way.
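To check this suggestion against the example, here is a small sketch that runs drop_duplicates on the 11-row table from the question (the data is retyped here as an assumption about the intended input).

```python
import pandas as pd

# Assumed input: the 11-row table from the question.
flags = ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE",
         "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"]
df = pd.DataFrame({
    'uid': ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
    'type': ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
    'is_topup': flags,
    'label': flags,
})

# drop_duplicates keeps the first occurrence of each distinct row.
result = df.drop_duplicates()
print(result.index.tolist())
```

On this data the surviving index labels are 0, 2, 4, 5, 7 and 8. Row 1 is an exact duplicate of row 0, so it is dropped even though the desired output in the question keeps it. The result is close to the desired output but not identical, and it would diverge further on data where rows before the first TRUE repeat.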