How to sum values grouped by a categorical column in pandas?
I have data with a categorical column that groups the rows, along with other columns, in a dataframe df like this:
id subid value
1 10 1.5
1 20 2.5
1 30 7.0
2 10 12.5
2 40 5
What I need is a column holding each subid's share of the total value within its id. For example, df could become:
id subid value id_sum proportion
1 10 1.5 11.0 0.136
1 20 2.5 11.0 0.227
1 30 7.0 11.0 0.636
2 10 12.5 17.5 0.714
2 40 5 17.5 0.285
Now, I tried getting the id_sum column by doing:
df['id_sum'] = df.groupby('id')['value'].sum()
But this does not work as hoped. My end goal is to get the proportion column. What is the correct way of getting that?
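Here we go. Use transform('sum'), which returns a result with the same index as df, so it can be assigned directly: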
df['id_sum'] = df.groupby('id')['value'].transform('sum')
df['proportion'] = df['value'] / df['id_sum']
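For anyone wondering why the first attempt fails, here is a minimal sketch that rebuilds the sample data from the question and compares the two approaches. The names per_group and misaligned are illustrative only and do not come from the original post, and the default 0..4 row index is an assumption about how df was built.

```python
import pandas as pd

# Sample data copied from the question (default RangeIndex 0..4 assumed).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "subid": [10, 20, 30, 10, 40],
    "value": [1.5, 2.5, 7.0, 12.5, 5.0],
})

# groupby(...).sum() returns one row per group, indexed by the id values (1 and 2).
per_group = df.groupby("id")["value"].sum()

# Assigning that Series aligns on index labels, so only the rows whose
# row label happens to be 1 or 2 receive a value; the others become NaN.
misaligned = df.assign(id_sum=per_group)["id_sum"]

# transform("sum") broadcasts each group's total back to every row of that group.
df["id_sum"] = df.groupby("id")["value"].transform("sum")
df["proportion"] = df["value"] / df["id_sum"]

print(misaligned)
print(df)
```

The original assignment lines up the group totals by index label rather than by id, which is why most rows end up empty. Under the same assumptions, df['value'] / df.groupby('id')['value'].transform('sum') gives the proportion in one step if the intermediate id_sum column is not needed.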