Calculate percentage from a dataframe which has the same ID and multiple values in the 'Value' column

Question

I have a dataframe with 45 unique Value_ID values. Each ID has several corresponding entries in the Value column, such as 'bread', 'slice', 'jelly', and 'powder'.

Here is a made-up sample of the dataset:

Value_ID     Value
1000         bread
1000         bread
1000         bread
1000         bread
1000         jelly
1000         bread
1001         powder
1001         bread
1001         bread
1001         bread
1001         bread
1002         slice
1002         powder
1002         bread
1002         jelly

From this data, I am trying to get the number (count) of Value_IDs for which at least 80% of the values are 'bread'. In this case the count is 2, and the Value_IDs are 1000 and 1001.

Answer 1

You can use groupby.mean on the boolean Series to get the proportion of 'bread' per Value_ID, then filter:

(df['Value'].eq('bread')
 .groupby(df['Value_ID']).mean()
 .loc[lambda x: x>=0.8]
 .index.to_list()
)

output: [1000, 1001]

Intermediate:

(df['Value'].eq('bread')
 .groupby(df['Value_ID']).mean()
)

output:

Value_ID
1000    0.833333
1001    0.800000
1002    0.250000
Name: Value, dtype: float64
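
Since the question asks for the count rather than the list of IDs, here is a minimal self-contained sketch that rebuilds the sample data and takes the length of the filtered index. The variable names (share, ids, count) are illustrative only, and the str.strip() call is an assumption added in case the real data contains stray whitespace around the values:

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "Value_ID": [1000] * 6 + [1001] * 5 + [1002] * 4,
    "Value": [
        "bread", "bread", "bread", "bread", "jelly", "bread",
        "powder", "bread", "bread", "bread", "bread",
        "slice", "powder", "bread", "jelly",
    ],
})

# Assumption: strip surrounding whitespace so "bread " still matches "bread".
is_bread = df["Value"].str.strip().eq("bread")

# The mean of a boolean Series per group is the share of True values.
share = is_bread.groupby(df["Value_ID"]).mean()

# Keep the IDs with at least 80% bread, then count them.
ids = share[share >= 0.8].index.to_list()
count = len(ids)

print(ids)
print(count)
```

The same count can also be obtained without building the list, for example with (share >= 0.8).sum().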
