Sum the values that have the same key in a dictionary

I have this list of dictionaries:

[{'Total Incidents': '1', 'CrimeTime': '19'}, 
 {'Total Incidents': '1', 'CrimeTime': '19'}, 
 {'Total Incidents': '1', 'CrimeTime': '19'}, 
 {'Total Incidents': '1', 'CrimeTime': '20'}, 
 {'Total Incidents': '1', 'CrimeTime': '20'},
 {'Total Incidents': '1', 'CrimeTime': '21'},
 {'Total Incidents': '1', 'CrimeTime': '21'}]

I need to convert the value of 'Total Incidents' to int and sum these values for all incidents that occurred in the same hour (the minutes and seconds are irrelevant). The output should be something like this:

[{'Total Incidents': 3, 'CrimeTime': '19'},
 {'Total Incidents': 2, 'CrimeTime': '20'},
 {'Total Incidents': 2, 'CrimeTime': '21'}]

I have used this method:

 [{ 'CrimeTime':       g[0], 
    'Total Incidents': sum(map(lambda x: int(x['Total Incidents']), g[1])) } 
  for g in itertools.groupby(mydata, lambda x: x['CrimeTime']) ]

But unfortunately, it sometimes repeats 'CrimeTime', so I get two dictionaries with the same 'CrimeTime' instead of a single one with the summed incidents. The original list is a lot bigger; I just used a short version to explain myself better.

Feel free to ask if you don't understand my question so that I can explain myself a bit better.

itertools.groupby only groups adjacent elements, so in most contexts (as in yours) it works best if the data is first sorted by the grouping key:

import itertools

# group on the hour; sorting first makes equal hours adjacent
key = lambda x: x['CrimeTime']
[
    {'CrimeTime': k, 'Total Incidents': sum(int(x['Total Incidents']) for x in g)}
    for k, g in itertools.groupby(sorted(mydata, key=key), key=key)
]
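To see why the sort matters, here is a minimal sketch on a made-up three-row sample (the names `rows`, `get_hour` and `sum_by_hour` are only for illustration and are not from the question). Without sorting, the two '19' rows are not adjacent, so groupby emits '19' twice, which matches the symptom described in the question:

```python
import itertools

# Hypothetical sample: the two '19' rows are separated by a '20' row.
rows = [
    {'Total Incidents': '1', 'CrimeTime': '19'},
    {'Total Incidents': '1', 'CrimeTime': '20'},
    {'Total Incidents': '1', 'CrimeTime': '19'},
]

def get_hour(row):
    """Grouping key: the hour stored under 'CrimeTime'."""
    return row['CrimeTime']

def sum_by_hour(rows):
    """Sum 'Total Incidents' for each consecutive run of equal hours."""
    return [
        {'CrimeTime': hour,
         'Total Incidents': sum(int(r['Total Incidents']) for r in group)}
        for hour, group in itertools.groupby(rows, key=get_hour)
    ]

# Unsorted input: three groups, with '19' appearing twice.
unsorted_result = sum_by_hour(rows)

# Sorted input: equal hours are adjacent, so each hour appears once.
sorted_result = sum_by_hour(sorted(rows, key=get_hour))

print(unsorted_result)
print(sorted_result)
```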

Using the generator expression instead of map with a lambda is mostly a matter of taste, but, at least in Python 2, it saves you some resources by not building an intermediate list. (In Python 3, map returns a lazy iterator, so no intermediate list is built either way.)

This should work (too late ;) ):

 [
   {'CrimeTime': g[0], 'Total Incidents': sum(map(lambda x: int(x['Total Incidents']), g[1]))}
      for g in
        itertools.groupby(
          # sort on the same key that groupby groups on
          sorted(data, key=lambda x: x['CrimeTime']), lambda x: x['CrimeTime'])
 ]
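If sorting the whole list is not wanted, a different technique (not the groupby approach used in the answers here) is to accumulate the totals in a dictionary keyed by hour. The sketch below assumes the input has the shape shown in the question; the function name `sum_incidents_by_hour` and the variable `sample` are made up for illustration:

```python
from collections import defaultdict

def sum_incidents_by_hour(rows):
    """Return one dict per distinct 'CrimeTime' with the summed incidents."""
    totals = defaultdict(int)  # missing hours start at 0
    for row in rows:
        totals[row['CrimeTime']] += int(row['Total Incidents'])
    # Only the distinct hours are sorted here, not the full input.
    # The hours are strings, so this ordering is lexicographic.
    return [
        {'Total Incidents': totals[hour], 'CrimeTime': hour}
        for hour in sorted(totals)
    ]

# Sample data copied from the question.
sample = [
    {'Total Incidents': '1', 'CrimeTime': '19'},
    {'Total Incidents': '1', 'CrimeTime': '19'},
    {'Total Incidents': '1', 'CrimeTime': '19'},
    {'Total Incidents': '1', 'CrimeTime': '20'},
    {'Total Incidents': '1', 'CrimeTime': '20'},
    {'Total Incidents': '1', 'CrimeTime': '21'},
    {'Total Incidents': '1', 'CrimeTime': '21'},
]

result = sum_incidents_by_hour(sample)
print(result)
```

This makes a single pass over the data and does not depend on the order of the input rows.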
