简体   繁体   中英

get the percentage of a grouped values based on another column pands python

I have two columns in my pandas_df. Category and Amount. My data looks like this:

category          amount
home              20
home              10
fashion           20
fashion           10
celebrity         30
celebrity         40

I want to group the category column and get the sums for each category. I would also need to know the percentage for each category.

Expected output: home 30 - 23% etc

My code:

dict(df.groupby(['category'])['amount'].sum().sort_values(ascending=False))

Output: home 30 fashion 30 celebrity 70

I would first create a "percent" column:

df['percent'] = df['amount'] / sum(df['amount'])

Then, you can group by category and get the desired output, rounded to 2 decimal places:

df.groupby(['category']).sum().round(2)

The output will be:

          amount    percent
category
    
celebrity   70          0.54
fashion     30          0.23
home        30          0.23

Depending on your business case, it may be valuable to have the "percent" column for future calculations as the one you are doing. Therefore, including such column as part of your dataset may be reasonable.

groupby, agg sum and calculate percentage on the resulting sum.

 g=df.groupby('category').agg(Sum=('amount','sum')).reset_index()#Calculate sum

g.assign(per=(g.Sum/(g.Sum.sum())*100).astype(int))#Calc the Percentage

    category  Sum  per
0  celebrity   70   53
1    fashion   30   23
2       home   30   23

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM