
Get the percentage of grouped values based on another column in pandas (Python)

I have two columns in my pandas_df: category and amount. My data looks like this:

category          amount
home              20
home              10
fashion           20
fashion           10
celebrity         30
celebrity         40

I want to group by the category column and get the sum for each category. I also need to know each category's percentage of the total.

Expected output: home 30 - 23% etc.

My code:

dict(df.groupby(['category'])['amount'].sum().sort_values(ascending=False))

Output: home 30 fashion 30 celebrity 70

I would first create a "percent" column:

df['percent'] = df['amount'] / sum(df['amount'])

Then you can group by category and get the desired output, rounded to 2 decimal places:

df.groupby(['category']).sum().round(2)

The output will be:

           amount  percent
category
celebrity      70     0.54
fashion        30     0.23
home           30     0.23

Depending on your business case, it may be valuable to keep the "percent" column for future calculations like the one you are doing. Including such a column as part of your dataset may therefore be reasonable.
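If the "percent" column is not needed on the original rows, the same summary can be computed from the grouped sums directly, since each category's share is its sum divided by the grand total. A minimal sketch, assuming the sample data from the question; the names `sums` and `summary` are illustrative and not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "category": ["home", "home", "fashion", "fashion", "celebrity", "celebrity"],
    "amount": [20, 10, 20, 10, 30, 40],
})

# Sum per category, then divide each sum by the grand total.
sums = df.groupby("category")["amount"].sum()
summary = pd.DataFrame({
    "amount": sums,
    "percent": (sums / sums.sum()).round(2),
})
print(summary)
```

This leaves `df` unchanged, which may be preferable when the per-row share has no meaning on its own.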

Group by category, aggregate the sum, and calculate the percentage on the resulting sums.

# Sum the amounts per category
g = df.groupby('category').agg(Sum=('amount', 'sum')).reset_index()

# Percentage of the grand total; astype(int) truncates, so 53.8 becomes 53
g.assign(per=(g.Sum / g.Sum.sum() * 100).astype(int))

    category  Sum  per
0  celebrity   70   53
1    fashion   30   23
2       home   30   23
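To get strings shaped like the "home 30 - 23%" line the question asks for, the percentages can be rounded and formatted per row. This is a sketch under assumptions, not part of the original answer: it rounds to the nearest integer instead of truncating (so celebrity shows 54 rather than 53), and the `lines` name is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "category": ["home", "home", "fashion", "fashion", "celebrity", "celebrity"],
    "amount": [20, 10, 20, 10, 30, 40],
})

g = df.groupby("category").agg(Sum=("amount", "sum")).reset_index()

# Round to the nearest whole percent before converting to int.
g["per"] = (g["Sum"] / g["Sum"].sum() * 100).round().astype(int)

# Build one "category sum - percent%" string per row.
lines = [f"{row.category} {row.Sum} - {row.per}%" for row in g.itertuples(index=False)]
print("\n".join(lines))
```

Rounded integer percentages are not guaranteed to add up to 100 in general, so keep the unrounded values if they feed later calculations.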
