How do you sum a DataFrame by group in Python pandas?
I have a for loop with the intent of checking for values greater than zero.
Problem is, I only want each iteration to check the sum of a group of IDs.
The grouping would be a match of the first 8 characters of the ID string.
I have that grouping taking place before the loop, but the loop still appears to search the entire df instead of each group.
LeftGroup = newDF.groupby('ID_Left_8')
for g in LeftGroup.groups:
    if sum(newDF['Hours_Calc'] > 0):
        print(g)
Is there a way to restrict that sum to each grouping of the leftmost 8 characters?
I was expecting .groups to accomplish this, but it still seems to search every single ID.
Thank you.
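def filter_and_sum(group):
    # Keep only the rows with positive hours, then total them
    return group.loc[group['Hours_Calc'] > 0, 'Hours_Calc'].sum()

# Split the frame on the 8-character ID prefix and apply the function per group
LeftGroup = newDF.groupby('ID_Left_8')
results = LeftGroup.apply(filter_and_sum)
print(results)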
This will compute the sum of the Hours_Calc column for each group, counting only the rows where Hours_Calc > 0. The resulting Series will have the leftmost 8 characters of the ID as the index, and the sum of the Hours_Calc column as the value.
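As for why the original loop looked at every ID: LeftGroup.groups only hands back the group keys, and the loop body still indexes newDF, so nothing inside it is restricted to the current group. Iterating over the GroupBy object itself gives (key, sub-DataFrame) pairs. Here is a rough sketch of that, with a loop-free version alongside. The sample data and the ID column are made up for illustration, and I'm assuming the intent is "print the groups whose total Hours_Calc is greater than zero".

```python
import pandas as pd

# Hypothetical sample data; only the column names Hours_Calc and ID_Left_8
# come from the question. The ID column and its values are assumptions.
newDF = pd.DataFrame({
    "ID": ["ABCDEFGH01", "ABCDEFGH02", "ZYXWVUTS01", "ZYXWVUTS02", "QRSTUVWX01"],
    "Hours_Calc": [2.0, -1.0, 0.0, -3.0, 4.5],
})
newDF["ID_Left_8"] = newDF["ID"].str[:8]

# Iterating over the GroupBy object yields (key, sub-DataFrame) pairs,
# so each check only sees the rows belonging to one group.
positive_groups = []
for key, group in newDF.groupby("ID_Left_8"):
    if group["Hours_Calc"].sum() > 0:
        positive_groups.append(key)
print(positive_groups)

# Loop-free version of the same check: total per group, then filter.
totals = newDF.groupby("ID_Left_8")["Hours_Calc"].sum()
positive_totals = totals[totals > 0]
print(positive_totals)

# Vectorized take on filter_and_sum: clipping negatives to 0 before summing
# means only positive rows contribute, and every group stays in the result.
positive_only = newDF["Hours_Calc"].clip(lower=0).groupby(newDF["ID_Left_8"]).sum()
print(positive_only)
```

Note that the first two versions test the group total, while the last one (like filter_and_sum) adds up only the positive rows, so pick whichever matches what you actually need.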