I have a table with my customer data like this:
Customer Price
AAA 100
AAA 100
AAA 200
BBB 100
BBB 220
BBB 200
BBB 200
What I want to do is find the customers for whom the number of prices >= 200 is greater than the number of prices < 200,
and add a label for each of them. For example:
Customer LABELS
AAA FALSE
BBB TRUE
Any ideas for this issue?
df.Price.ge(200).groupby(df.Customer).mean().gt(.5)
Customer
AAA False
BBB True
Name: Price, dtype: bool
Or, if you insist on your format:
df.Price.ge(200).groupby(df.Customer).mean().gt(.5).reset_index(name='Labels')
Customer Labels
0 AAA False
1 BBB True
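To see why this works: the mean of a boolean Series is the fraction of True values, so a mean above 0.5 means there are more prices >= 200 than prices < 200 for that customer. Here is a minimal, self-contained sketch; the DataFrame is rebuilt by hand from the sample data in the question, and the intermediate names (`is_high`, `share_high`, `labels`) are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Customer": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB"],
    "Price": [100, 100, 200, 100, 220, 200, 200],
})

# True where the price is at least 200.
is_high = df["Price"].ge(200)

# Per-customer fraction of rows with Price >= 200.
share_high = is_high.groupby(df["Customer"]).mean()

# More than half of the rows are high -> label True.
labels = share_high.gt(0.5).reset_index(name="Labels")
print(labels)
```

Note that an exact tie (a fraction of exactly 0.5) is labeled False, which matches the "more than" wording of the question.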
Straightforward answer:
df.groupby('Customer').apply(
    lambda g: (g['Price'] >= 200).sum() > (g['Price'] < 200).sum()
)
Summing a boolean vector returns the number of True values.
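If you want to see the two counts side by side, the same comparison can probably be written without `apply` by summing each boolean mask per group. This is only a sketch under the assumption that the data looks like the sample in the question; the column names `n_high` and `n_low` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Customer": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB"],
    "Price": [100, 100, 200, 100, 220, 200, 200],
})

high = df["Price"] >= 200  # True where the price is at least 200
low = df["Price"] < 200    # True where the price is below 200

# Summing a boolean mask per group counts its True values.
counts = pd.DataFrame({
    "n_high": high.groupby(df["Customer"]).sum(),
    "n_low": low.groupby(df["Customer"]).sum(),
})

# Label a customer True when the high count exceeds the low count.
counts["Labels"] = counts["n_high"] > counts["n_low"]
print(counts)
```

Keeping the counts around makes it easy to check each label by eye.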