
Python Pandas Dataframe - Drop group based on Groupby and filter condition

I have a dataframe that has the following columns. I want to group the data by the order number and then drop all the groups that do not contain one specific item.

order_id product_id purchase_date
1234 23546.0. 2020-01-10.
1234. 32423.0 2020-01-10.
5678. 43244.0. 2020-02-10.

When I use the line below, it doesn't drop order_id 5678:

df6 = df2.groupby(by='order_id').filter(lambda df2: df2['product_id'] == 23546.0)

I get the error: 'DataFrame' object is not callable

Use:

df.loc[df['product_id'].eq('23546.0.').groupby(df['order_id']).transform('any')]

   order_id product_id purchase_date
0    1234.0   23546.0.   2020-01-10.
1    1234.0    32423.0   2020-01-10.

If product_id is stored as a float rather than a string:

df.loc[df['product_id'].eq(23546.0).groupby(df['order_id']).transform('any')]
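For reference, here is a minimal self-contained sketch of the same idea, assuming product_id is a float column. The sample frame is rebuilt from the question's data with the stray trailing dots dropped, and the names has_item, mask and kept are only illustrative:

```python
import pandas as pd

# Sample data rebuilt from the question (assumed clean, float product_id).
df = pd.DataFrame({
    "order_id": [1234, 1234, 5678],
    "product_id": [23546.0, 32423.0, 43244.0],
    "purchase_date": ["2020-01-10", "2020-01-10", "2020-02-10"],
})

# True on rows that are the wanted product.
has_item = df["product_id"].eq(23546.0)

# Broadcast "does any row in this order match?" back to every row of the order.
mask = has_item.groupby(df["order_id"]).transform("any")

# Keep only rows belonging to orders that contain the product.
kept = df.loc[mask]
print(kept)
```

Because transform returns a result aligned with the original index, the mask can be passed straight to df.loc without any merge or re-indexing.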

Another solution:

df_out = df.groupby(by="order_id").filter(lambda x: 23546.0 in x["product_id"].values)
print(df_out)

Prints:

   order_id  product_id purchase_date
0    1234.0     23546.0    2020-01-10
1    1234.0     32423.0    2020-01-10
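As to why the original attempt did not work: the function passed to filter must return a single True or False per group, whereas df2['product_id'] == 23546.0 yields a boolean Series with one value per row. A hedged sketch of the question's line with that comparison reduced by .any() (the sample frame is again rebuilt from the question, and target is just an illustrative name):

```python
import pandas as pd

# Sample data rebuilt from the question (assumed clean, float product_id).
df2 = pd.DataFrame({
    "order_id": [1234, 1234, 5678],
    "product_id": [23546.0, 32423.0, 43244.0],
    "purchase_date": ["2020-01-10", "2020-01-10", "2020-02-10"],
})

target = 23546.0

# .any() collapses the per-row comparison into one boolean per order,
# which is what groupby(...).filter expects from its function.
df6 = df2.groupby("order_id").filter(
    lambda g: (g["product_id"] == target).any()
)
print(df6)
```

This is equivalent in effect to the `in x["product_id"].values` check above; which one to use is mostly a matter of taste.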


 