
Python Pandas group by with flag for category

I have a data frame like so:

|transaction_id|category|
-------------------------
|1234          |Book    |
|1234          |Car     |
|1234          |TV      |
|1235          |Car     |
|1235          |TV      |
|1236          |Car     |

I want to group by transaction_id and create a column that flags whether each transaction_id has a TV entry in the category column. Ideally, the resulting data frame would look like this:

|transaction_id|HasTV?|
-----------------------
|1234          |Y     |
|1235          |Y     |
|1236          |N     |

I'm using pandas and I know how to use the groupby function; I've just never had to do something like this before, where a conditional check is involved.

One option is to collect the unique categories for each transaction_id with .unique() and then check each resulting array for 'TV':

In [28]: df.groupby("transaction_id")['category'].unique().apply(lambda x: 'TV' in x)
Out[28]:
transaction_id
1234.0     True
1235.0     True
1236.0    False
Name: category, dtype: bool
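
The result above is boolean rather than the Y/N flags asked for in the question. As a minimal sketch, assuming the sample data from the question (the names has_tv and result are illustrative, not part of the original answer), the booleans can be mapped to Y/N and turned back into a two-column frame:

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({
    "transaction_id": [1234, 1234, 1234, 1235, 1235, 1236],
    "category": ["Book", "Car", "TV", "Car", "TV", "Car"],
})

# True for each transaction whose unique categories include 'TV'
has_tv = df.groupby("transaction_id")["category"].unique().apply(lambda x: "TV" in x)

# Map the booleans to Y/N, name the column, and move transaction_id
# from the index back into a regular column
result = has_tv.map({True: "Y", False: "N"}).rename("HasTV?").reset_index()
print(result)
```

Here transaction_id is built as an integer column, so the index would print as 1234 rather than 1234.0 as in the output above.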

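Another version, possibly faster but less obvious, is to test for the desired category up front and then do the groupby. Because True sorts above False, taking .max() of the boolean Series within each group returns True whenever at least one row in the group is 'TV':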

In [29]: (df['category'] == 'TV').groupby(df["transaction_id"]).max()
Out[29]:
transaction_id
1234.0     True
1235.0     True
1236.0    False
Name: category, dtype: bool
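
If the intent of .max() feels too hidden, .any() on the grouped boolean Series should give the same answer and reads more directly. The following is a sketch under that assumption, again using the sample data from the question; the names is_tv, has_tv, and flags are illustrative, and np.where is just one way to produce the Y/N column:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({
    "transaction_id": [1234, 1234, 1234, 1235, 1235, 1236],
    "category": ["Book", "Car", "TV", "Car", "TV", "Car"],
})

# Row-level test first: is this row a TV?
is_tv = df["category"] == "TV"

# Then reduce per transaction: does any row in the group pass the test?
has_tv = is_tv.groupby(df["transaction_id"]).any()

# Build the Y/N frame described in the question
flags = pd.DataFrame({
    "transaction_id": has_tv.index,
    "HasTV?": np.where(has_tv, "Y", "N"),
})
print(flags)
```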
