Python Pandas group by with flag for category

Question

I have a data frame like so:

|transaction_id|category|
-------------------------
|1234          |Book    |
|1234          |Car     |
|1234          |TV      |
|1235          |Car     |
|1235          |TV      |
|1236          |Car     |

I want to group by transaction_id and create a column that flags whether each transaction_id has a TV row in the category column. Ideally, the resulting data frame would look like this:

|transaction_id|HasTV?|
-----------------------
|1234          |Y     |
|1235          |Y     |
|1236          |N     |

I'm using pandas and I know how to use the groupby function; I've just never had to do something like this, where a conditional check comes first.

Answer 1

One option is to collect the .unique() categories for each transaction_id and then check each resulting array for 'TV':

In [28]: df.groupby("transaction_id")['category'].unique().apply(lambda x: 'TV' in x)
Out[28]:
transaction_id
1234.0     True
1235.0     True
1236.0    False
Name: category, dtype: bool

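Another option, possibly faster but less obvious, is to test for the desired category up front and then group the resulting boolean Series. Because True sorts above False, max() returns True for a group whenever at least one of its rows is 'TV':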

In [29]: (df['category'] == 'TV').groupby(df["transaction_id"]).max()
Out[29]:
transaction_id
1234.0     True
1235.0     True
1236.0    False
Name: category, dtype: bool
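
Neither snippet above produces the exact Y/N layout from the question, so here is a minimal sketch of one way to get there. It rebuilds the sample data from the question (with integer ids, which is an assumption), uses .any() in place of .max() (equivalent for booleans), and the names has_tv and result are only illustrative:

```python
import pandas as pd

# Sample data copied from the question; integer ids are an assumption.
df = pd.DataFrame({
    "transaction_id": [1234, 1234, 1234, 1235, 1235, 1236],
    "category": ["Book", "Car", "TV", "Car", "TV", "Car"],
})

# Boolean Series: True where the row is a TV, then reduce per transaction.
# .any() is True for a group if at least one of its rows is True.
has_tv = df["category"].eq("TV").groupby(df["transaction_id"]).any()

# Convert the booleans to Y/N and turn the result back into a data frame.
result = (
    has_tv.map({True: "Y", False: "N"})
    .rename("HasTV?")
    .reset_index()
)

print(result)
```

If a boolean column is good enough, the .map() step can be dropped and has_tv used directly.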
