Sum second value in tuple for each given first value in tuples using Python
I'm working with a large set of records and need to sum a given field for each customer account to reach an overall account balance. While I can probably put the data in any reasonable form, I figured the easiest would be a list of tuples (cust_id, balance_contribution) built up as I process each record. After the round of processing, I'd like to add up the second item for each cust_id, and I'm trying to do it without looping through the data thousands of times.
As an example, the input data could look like:
[(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(2,20.00)]
And I want the output to be something like this:
[(1,125.00),(2,50.00)]
I've read other questions where people just wanted to add up the second element of every tuple, using something like sum(j for i, j in a), but that doesn't separate the totals by the first element.
This discussion, "python sum tuple list based on tuple first value", puts the values in a list assigned to each key (cust_id) in a dictionary. I suppose I could then figure out how to add up the values in each list?
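A rough sketch of what that dictionary-of-lists route might look like, assuming the sample data above (the names grouped and balances are just illustrative, not taken from the linked discussion):

```python
from collections import defaultdict

data = [(1, 125.50), (2, 30.00), (1, 24.50), (1, -25.00), (2, 20.00)]

# Collect every contribution into a list keyed by cust_id.
grouped = defaultdict(list)
for cust_id, contribution in data:
    grouped[cust_id].append(contribution)

# Sum each customer's list to get one balance per cust_id.
balances = [(cust_id, sum(values)) for cust_id, values in sorted(grouped.items())]
print(balances)  # [(1, 125.0), (2, 50.0)]
```

This works, but it stores every individual value before summing, which seems wasteful when only the totals are needed.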
Any thoughts on a better approach to this?
Thank you in advance.
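import collections

def total(records):
    # Missing keys start at 0, so each cust_id accumulates in a single pass.
    dct = collections.defaultdict(int)
    for cust_id, contrib in records:
        dct[cust_id] += contrib
    # In Python 3 this is a view of (cust_id, total) pairs; wrap it in list() if a list is needed.
    return dct.items()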
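For what it's worth, here is a small usage sketch of that function on the sample data from the question. The variable names records and result are my own, and sorted() is only there to turn the items into an ordered list of tuples:

```python
import collections

def total(records):
    # Same approach as above: one pass, accumulating per cust_id.
    dct = collections.defaultdict(int)
    for cust_id, contrib in records:
        dct[cust_id] += contrib
    return dct.items()

records = [(1, 125.50), (2, 30.00), (1, 24.50), (1, -25.00), (2, 20.00)]

# sorted() consumes the items view and returns a list of (cust_id, total) tuples.
result = sorted(total(records))
print(result)  # [(1, 125.0), (2, 50.0)]
```

Each record is visited exactly once, which addresses the concern about looping over the data repeatedly.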
Would the following code be useful?
in_list = [(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(3,20.00)]
totals = {}
for uid, x in in_list:
    if uid not in totals:
        # First contribution seen for this uid.
        totals[uid] = x
    else:
        totals[uid] += x
print(totals)
Output:
{1: 125.0, 2: 30.0, 3: 20.0}
People usually like one-liners in Python:
[(uk,sum([vv for kk,vv in data if kk==uk])) for uk in set([k for k,v in data])]
When
data=[(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(3,20.00)]
The output is
[(1, 125.0), (2, 30.0), (3, 20.0)]
Here's an itertools solution:
>>> from itertools import groupby
>>> x = [(1, 125.5), (2, 30.0), (1, 24.5), (1, -25.0), (2, 20.0)]
>>> sorted(x)
[(1, -25.0), (1, 24.5), (1, 125.5), (2, 20.0), (2, 30.0)]
>>> for a, b in groupby(sorted(x), key=lambda item: item[0]):
...     print(a, sum(item[1] for item in b))
...
1 125.0
2 50.0
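The same idea can be written as a plain script that builds the list of tuples the question asked for. This is only a sketch: the names data, by_customer and totals are mine, and operator.itemgetter(0) stands in for the lambda.

```python
from itertools import groupby
from operator import itemgetter

data = [(1, 125.50), (2, 30.00), (1, 24.50), (1, -25.00), (2, 20.00)]

# groupby only merges adjacent items, so the data must be sorted by cust_id first.
by_customer = sorted(data, key=itemgetter(0))

# Sum the second element of every tuple within each cust_id group.
totals = [
    (cust_id, sum(amount for _, amount in group))
    for cust_id, group in groupby(by_customer, key=itemgetter(0))
]
print(totals)  # [(1, 125.0), (2, 50.0)]
```

Keep in mind that the sort makes this O(n log n), whereas the dictionary-based answers get by with a single pass over the records.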