
Find the most frequent combination and add labels

I have a table with my customer data like this:

Customer    Price
AAA            100
AAA            100
AAA            200
BBB            100
BBB            220
BBB            200
BBB            200

What I want to do is find the customers for whom the number of prices >= 200 is greater than the number of prices < 200, and add a label for each customer. For example:

Customer    LABELS
AAA            FALSE
BBB            TRUE

Any ideas for this issue?

df.Price.ge(200).groupby(df.Customer).mean().gt(.5)

Customer
AAA    False
BBB     True
Name: Price, dtype: bool
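To see why this one-liner works, here is a minimal step-by-step sketch. The DataFrame is rebuilt from the sample data in the question; the intermediate names (`is_high`, `share_high`, `labels`) are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Customer": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB"],
    "Price": [100, 100, 200, 100, 220, 200, 200],
})

# Step 1: boolean Series, True where Price >= 200.
is_high = df["Price"].ge(200)

# Step 2: the per-customer mean of the booleans is the share of
# rows with Price >= 200 (True counts as 1, False as 0).
share_high = is_high.groupby(df["Customer"]).mean()

# Step 3: a share strictly above 0.5 means the >= 200 rows
# outnumber the < 200 rows for that customer.
labels = share_high.gt(0.5)

print(share_high)
print(labels)
```

Note that a customer with exactly as many prices >= 200 as prices < 200 has a share of 0.5 and is labelled False, which matches the "more than" wording in the question.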

Or, if you insist on your format:

df.Price.ge(200).groupby(df.Customer).mean().gt(.5).reset_index(name='Labels')

  Customer  Labels
0      AAA   False
1      BBB    True

Straightforward answer:

df.groupby('Customer').apply(
    lambda g: (g['Price'] >= 200).sum() > (g['Price'] < 200).sum()
)

Summing a boolean vector will return the number of True values.
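As a rough illustration of that counting behaviour, the sketch below computes the two counts per customer explicitly instead of inside `apply`. It assumes the sample data from the question and no missing prices, so that `~high` is equivalent to `Price < 200`; the variable names are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Customer": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB"],
    "Price": [100, 100, 200, 100, 220, 200, 200],
})

high = df["Price"] >= 200

# Summing booleans counts the True values across the whole column.
n_high_total = int(high.sum())

# Per-customer counts for each side of the comparison.
# Assumption: no NaN prices, so ~high means Price < 200.
n_high = high.groupby(df["Customer"]).sum()
n_low = (~high).groupby(df["Customer"]).sum()

# True where the >= 200 rows outnumber the < 200 rows.
labels = n_high > n_low

print(n_high_total)
print(labels)
```

This gives the same labels as the `apply` version while keeping the two counts available for inspection.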


 