How to sum up individual columns if they have the same value in a different column?
I have the following data frame:
Names Counts Year
0 Jordan 1043 2000
1 Steve 204 2000
2 Brock 3 2000
3 Steve 33 2000
4 Mike 88 2000
... ... ... ...
20001 Bryce 2 2015
20002 Steve 11 2015
20003 Penny 24 2015
20004 Steve 15 2015
20005 Penny 5 2015
I want to add up all the counts for each name if it appears multiple times in a year. An example of the output might look like:
Names Counts Year
0 Jordan 1043 2000
1 Steve 237 2000
2 Brock 3 2000
3 Mike 88 2000
... ... ... ...
20001 Bryce 2 2015
20002 Steve 26 2015
20003 Penny 29 2015
I've tried the following:
(df[df['Names'].eq('Steve')].groupby('Year').agg({'Names': 'first', 'Counts': sum}).reset_index())
This returns the following for an individual name, but it's not what I'm looking for.
Year Names Counts
0 2000 Steve 237
1 2015 Steve 26
Try:
df['Counts'] = df.groupby(['Names','Year'])['Counts'].transform('sum')
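One thing to keep in mind: `transform('sum')` returns a value for every original row, so repeated name/year pairs stay in the frame as duplicate rows that each carry the group total. A minimal sketch of collapsing them afterwards with `drop_duplicates`, assuming a small made-up sample that mirrors the first rows of the question (the variable name `deduped` is only for illustration):

```python
import pandas as pd

# Hypothetical sample mirroring the first rows shown in the question
df = pd.DataFrame({
    "Names": ["Jordan", "Steve", "Brock", "Steve", "Mike"],
    "Counts": [1043, 204, 3, 33, 88],
    "Year": [2000, 2000, 2000, 2000, 2000],
})

# transform returns a Series aligned with the original index,
# so the frame still has five rows after this assignment
df["Counts"] = df.groupby(["Names", "Year"])["Counts"].transform("sum")

# Both "Steve" rows now hold the group total; keep the first of each pair
deduped = df.drop_duplicates(subset=["Names", "Year"]).reset_index(drop=True)
print(deduped)
```

If the duplicate rows are not needed at all, aggregating with `.sum()` instead of `transform` avoids the extra step.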
The code that you shared looks like it is filtering the "Names" column for only the value "Steve". The code below groups by each unique "Names" and "Year" combination and sums all related "Counts" values.
tempdf = df.groupby(['Names',"Year"])['Counts'].sum().reset_index()
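As a rough end-to-end check, here is a sketch that runs the same grouping on a small made-up sample built from the rows visible in the question. The `sort=False` and `as_index=False` arguments and the final column reordering are additions on my part, used only so the result keeps the order of first appearance and the column layout of the desired output:

```python
import pandas as pd

# Hypothetical sample built from the rows visible in the question
df = pd.DataFrame({
    "Names": ["Jordan", "Steve", "Brock", "Steve", "Mike",
              "Bryce", "Steve", "Penny", "Steve", "Penny"],
    "Counts": [1043, 204, 3, 33, 88, 2, 11, 24, 15, 5],
    "Year": [2000] * 5 + [2015] * 5,
})

# sort=False keeps groups in order of first appearance;
# as_index=False returns the group keys as ordinary columns
result = df.groupby(["Names", "Year"], sort=False, as_index=False)["Counts"].sum()

# Reorder columns to match the layout in the question
result = result[["Names", "Counts", "Year"]]
print(result)
```

Without `sort=False`, the groups come back sorted by "Names" and then "Year", which is equally valid but will not match the row order in the question.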