I have data in which rows are grouped together, and in my final output I need to keep only those groups that contain both F and P values. Groups that contain only F or only P should be discarded. In the table below, only the b_name values that have both F and P are selected: XXXX, ZZZZ and BBBB are kept, and the others are not.
Input
Output
You could group by the b_name column and then use filter to keep only those groups whose p_f column contains both the F and the P values. Next, remove the duplicated rows with drop_duplicates("b_name") and set p_f to the desired output value, "FP".
import pandas as pd

df = pd.read_csv("sample.csv", sep=";")
print(df)

# Keep only the groups whose p_f column holds both "F" and "P".
df_group = df.groupby("b_name")
df_filter = df_group.filter(
    lambda x: ("F" in x.p_f.values) and ("P" in x.p_f.values)
)

# Keep one row per b_name; copy so the assignment below does not hit a view.
df_filter = df_filter.drop_duplicates("b_name").copy()
df_filter["p_f"] = "FP"
print(df_filter[["b_id", "b_name", "p_f"]])
Output from df_filter
    b_id b_name p_f
0  29743   XXXX  FP
3  29751   ZZZZ  FP
6  30832   BBBB  FP
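Because sample.csv is not shown here, the same steps can be tried on a small in-memory frame. This is only a sketch: the rows below are made up for illustration (they are not the asker's data), and has_both is a hypothetical helper name that replaces the lambda with a set-based check.

```python
import pandas as pd

# Hypothetical rows for illustration only; the real data lives in sample.csv.
df = pd.DataFrame({
    "b_id": [1, 2, 3, 4, 5, 6, 7],
    "b_name": ["XXXX", "XXXX", "YYYY", "YYYY", "ZZZZ", "ZZZZ", "AAAA"],
    "p_f": ["F", "P", "F", "F", "P", "F", "P"],
})


def has_both(group: pd.DataFrame) -> bool:
    """Return True when the group's p_f column contains both "F" and "P"."""
    return {"F", "P"}.issubset(group["p_f"])


# Drop every group that lacks one of the two values.
kept = df.groupby("b_name").filter(has_both)

# One row per surviving b_name, with p_f overwritten by the combined label.
result = kept.drop_duplicates("b_name").assign(p_f="FP")
print(result[["b_id", "b_name", "p_f"]])
```

With these rows, YYYY (only F) and AAAA (only P) are dropped, while XXXX and ZZZZ remain. Using assign returns a new frame, which sidesteps assigning into the result of drop_duplicates.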