How to sum values grouped by a categorical column in pandas?

I have data in a dataframe df with a categorical column that groups the data, plus other columns, like this:

id      subid      value
1       10         1.5
1       20         2.5
1       30         7.0 
2       10         12.5
2       40         5

What I need is a column giving, for each subid, its share of the total value within its id. For example, df could become:

id      subid      value     id_sum    proportion
1       10         1.5       11.0      0.136
1       20         2.5       11.0      0.227
1       30         7.0       11.0      0.636
2       10         12.5      17.5      0.714
2       40         5         17.5      0.286

I tried getting the id_sum column by doing:

df['id_sum'] = df.groupby('id')['value'].sum()

But this does not work as hoped. My end goal is to get the proportion column. What is the correct way of getting it?

Use transform('sum'), which returns each group's total aligned to the original rows, then divide value by it:

df['id_sum'] = df.groupby('id')['value'].transform('sum')
df['proportion'] = df['value'] / df['id_sum']
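To see why the original attempt fails and the transform version works, here is a minimal sketch built on the sample data from the question. The names per_group and misaligned are only illustrative, and the explanation assumes df has the default 0..4 integer index.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "subid": [10, 20, 30, 10, 40],
    "value": [1.5, 2.5, 7.0, 12.5, 5.0],
})

# groupby(...).sum() returns one value per group, indexed by the id values (1, 2).
per_group = df.groupby("id")["value"].sum()

# Assigning that Series to a column aligns on index labels, not on the id column,
# so only the rows whose index label is 1 or 2 receive a value; the rest are NaN.
misaligned = df.assign(id_sum=per_group)

# transform("sum") broadcasts each group's total back onto every row of the group.
df["id_sum"] = df.groupby("id")["value"].transform("sum")
df["proportion"] = df["value"] / df["id_sum"]

print(misaligned)
print(df.round(3))
```

Within each id the proportions should sum to 1, which is a quick sanity check on the result.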


 