I have two columns in my pandas_df. Category and Amount. My data looks like this:
category amount
home 20
home 10
fashion 20
fashion 10
celebrity 30
celebrity 40
I want to group the category column and get the sums for each category. I would also need to know the percentage for each category.
Expected output: home 30 - 23% etc
My code:
dict(df.groupby(['category'])['amount'].sum().sort_values(ascending=False))
Output: home 30 fashion 30 celebrity 70
I would first create a "percent" column:
df['percent'] = df['amount'] / sum(df['amount'])
Then, you can group by category and get the desired output, rounded to 2 decimal places:
df.groupby(['category']).sum().round(2)
The output will be:
amount percent
category
celebrity 70 0.54
fashion 30 0.23
home 30 0.23
Depending on your business case, it may be valuable to have the "percent" column for future calculations as the one you are doing. Therefore, including such column as part of your dataset may be reasonable.
groupby, agg sum and calculate percentage on the resulting sum.
g=df.groupby('category').agg(Sum=('amount','sum')).reset_index()#Calculate sum
g.assign(per=(g.Sum/(g.Sum.sum())*100).astype(int))#Calc the Percentage
category Sum per
0 celebrity 70 53
1 fashion 30 23
2 home 30 23
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.