I have a dataframe with 45 unique IDs, each with corresponding values such as 'bread', 'slice', 'jelly', and 'powder'.
Here is a made-up sample of the dataset:
Value_ID Value
1000 bread
1000 bread
1000 bread
1000 bread
1000 jelly
1000 bread
1001 powder
1001 bread
1001 bread
1001 bread
1001 bread
1002 slice
1002 powder
1002 bread
1002 jelly
From this data, I am trying to get the number (count) of Value_IDs where at least 80% of the values are bread. In this case the count is 2, and the Value_IDs are 1000 and 1001.
You can use groupby.mean on the boolean Series to get the proportion of bread per Value_ID, then filter:
(df['Value'].eq('bread')
.groupby(df['Value_ID']).mean()
.loc[lambda x: x>=0.8]
.index.to_list()
)
output: [1000, 1001]
Intermediate:
(df['Value'].eq('bread')
.groupby(df['Value_ID']).mean()
)
output:
Value_ID
1000 0.833333
1001 0.800000
1002 0.250000
Name: Value, dtype: float64
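Since the question also asks for the count, here is a minimal self-contained sketch that rebuilds the sample data from the question and takes the length of the filtered index. The DataFrame construction and the names share, ids, and count are assumptions made for illustration; they do not appear in the original post.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be typical of the real dataset)
df = pd.DataFrame({
    "Value_ID": [1000] * 6 + [1001] * 5 + [1002] * 4,
    "Value": [
        "bread", "bread", "bread", "bread", "jelly", "bread",  # 1000
        "powder", "bread", "bread", "bread", "bread",          # 1001
        "slice", "powder", "bread", "jelly",                   # 1002
    ],
})

# True where the value is bread; the mean of booleans per group is the bread share
share = df["Value"].eq("bread").groupby(df["Value_ID"]).mean()

# Keep the IDs whose bread share is at least 80%, then count them
ids = share[share >= 0.8].index.to_list()
count = len(ids)

print(ids)
print(count)
```

If only the count is needed, (share >= 0.8).sum() should give the same number without building the list.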