How to Sum by Column in Pandas DF and Remove Additional Rows
I have a dataframe in the form:
   Sales House  Station  Day  Date    Time  Daypart   Total  Unique Key
0  CARLTON      CARLTON  Mon  3AUG20  1213  DAYTIME   0      CARLTON_ 3AUG20
1  CARLTON      CARLTON  Mon  3AUG20  2307  POSTPEAK  30     CARLTON_ 3AUG20
2  CARLTON      CARLTON  Tue  4AUG20  1015  COFFEE    30     NaN
3  CARLTON      CARLTON  Tue  4AUG20  1027  COFFEE    30     CARLTON_ 4AUG20
4  CARLTON      CARLTON  Wed  5AUG20  1310  DAYTIME   30     CARLTON_ 5AUG20
The Unique Key column is just a column I added to try to make this process easier (please correct me if I am wrong). Essentially, I would like to sum the Total column by Unique Key, and also remove the extra rows associated with each Unique Key, leaving only one.
As an example, the above df would come out as below. In this instance row 1 and row 2 (the first two rows) share a Unique Key, so their Total values should be summed and row 2 then removed.
   Sales House  Station  Day  Date    Time  Daypart  Total  Unique Key
0  CARLTON      CARLTON  Mon  3AUG20  1213  DAYTIME  30     CARLTON_ 3AUG20
1  CARLTON      CARLTON  Tue  4AUG20  1015  COFFEE   30     NaN
2  CARLTON      CARLTON  Tue  4AUG20  1027  COFFEE   30     CARLTON_ 4AUG20
3  CARLTON      CARLTON  Wed  5AUG20  1310  DAYTIME  30     CARLTON_ 5AUG20
Is there a way to easily do this?
It seems like you need the df.groupby() method.
I would try doing this in three steps:
aggregated = df.groupby(['Station', 'Date'])['Total'].sum().reset_index() # Getting sum
df = df.drop_duplicates(['Station', 'Date']) # Removing duplicated rows
df = df.drop('Total', axis=1).merge(aggregated, on=['Station', 'Date']) # Merge back
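For a runnable check, the three steps can be sketched against the sample data from the question. The frame below is retyped by hand from the question (the Unique Key column is left out because these steps do not use it), and the names keys and result are only illustrative.

```python
import pandas as pd

# Sample data retyped from the question (assumption: dtypes match the real data).
df = pd.DataFrame({
    "Sales House": ["CARLTON"] * 5,
    "Station": ["CARLTON"] * 5,
    "Day": ["Mon", "Mon", "Tue", "Tue", "Wed"],
    "Date": ["3AUG20", "3AUG20", "4AUG20", "4AUG20", "5AUG20"],
    "Time": [1213, 2307, 1015, 1027, 1310],
    "Daypart": ["DAYTIME", "POSTPEAK", "COFFEE", "COFFEE", "DAYTIME"],
    "Total": [0, 30, 30, 30, 30],
})

keys = ["Station", "Date"]

# Step 1: sum Total per Station/Date pair.
aggregated = df.groupby(keys)["Total"].sum().reset_index()

# Step 2: keep only the first row of each Station/Date pair.
deduplicated = df.drop_duplicates(keys)

# Step 3: replace the per-row Total with the summed Total.
result = deduplicated.drop("Total", axis=1).merge(aggregated, on=keys)

print(result)
```

Note that grouping on Station and Date also combines the two 4AUG20 rows, so on this sample the result has three rows rather than the four shown in the question.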
Edited according to the comment: added the df = df.drop_duplicates(['Station', 'Date']) line in order to remove duplicates.
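If the row with a missing Unique Key has to stay separate, as in the expected output in the question, one possible variant is to group on Unique Key itself and give each keyless row its own placeholder key. This is only a sketch under assumptions: the sample frame is retyped from the question, and the names placeholder, group_key, and the "_row_" prefix are made up for illustration.

```python
import pandas as pd

# Sample data retyped from the question, including the Unique Key column.
df = pd.DataFrame({
    "Sales House": ["CARLTON"] * 5,
    "Station": ["CARLTON"] * 5,
    "Day": ["Mon", "Mon", "Tue", "Tue", "Wed"],
    "Date": ["3AUG20", "3AUG20", "4AUG20", "4AUG20", "5AUG20"],
    "Time": [1213, 2307, 1015, 1027, 1310],
    "Daypart": ["DAYTIME", "POSTPEAK", "COFFEE", "COFFEE", "DAYTIME"],
    "Total": [0, 30, 30, 30, 30],
    "Unique Key": ["CARLTON_ 3AUG20", "CARLTON_ 3AUG20", None,
                   "CARLTON_ 4AUG20", "CARLTON_ 5AUG20"],
})

# Hypothetical placeholder keys: one distinct value per row, used only where
# Unique Key is missing, so keyless rows are never merged with anything.
placeholder = pd.Series([f"_row_{i}" for i in df.index], index=df.index)
group_key = df["Unique Key"].fillna(placeholder).rename("_group")

# Keep the first value of every column per group, except Total, which is summed.
how = {col: "first" for col in df.columns}
how["Total"] = "sum"

# sort=False keeps groups in order of first appearance.
result = df.groupby(group_key, sort=False).agg(how).reset_index(drop=True)

print(result)
```

The trade-off is that every non-summed column takes its value from the first row of each group, so details such as Time and Daypart from the later rows are discarded.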