Python: sum the values of a dictionary whose (tuple) keys are similar

Question

I have a single dictionary that looks like this:

{('20144', 'Wirtschaftskammer Österreich Fachverband der Telekommunikations- und Rundfunkunternehmungen', 'Bezirksrundschau Oberösterreich', '4'): 12321.88, ('20143', 'Wirtschaftskammer Niederösterreich Fachgruppe Unternehmensberatung und Informationstechnologie NÖ', 'trend', '31'): 5700.53, ('20144', 'Wirtschaftskammer Tirol - Sparte Gewerbe und Handwerk Innung der Lebensmittelgewerbe', 'ORF Radio Tirol', '4'): 5861.56, ('20144', 'Bundesministerium für Land- und Forstwirtschaft Umwelt und Wasserwirtschaft', 'Weekend Magazin', '2'): 17355.1, ('20144', 'Bundesministerium für Land- und Forstwirtschaft Umwelt und Wasserwirtschaft', 'Woman', '2'): 12911.5, ('20144', 'Bundesministerium für Wissenschaft Forschung und Wirtschaft', 'Die Presse', '31'): 30965.4, ('20143', 'Bundesministerium für Europa Integration und Äußeres', 'Kronen Zeitung', '4'): 52490.46,.......)}

I want to sum all the values whose keys have the same number at the front (for example 20144) and the same number at the end (for example 2 or 31).

I thought about a dict comprehension, but I am struggling with comparing the keys I need. How can I easily compare them?

Answer 1

Solution:

trimmed={}
for k,v in data.items(): 
    trimmed.setdefault((k[0],k[-1]),[]).append(v)

{k:sum(v) for k,v in trimmed.items()}

Output:

{('20144', '4'): 18183.44, ('20144', '31'): 30965.4, ('20143', '31'): 5700.53, ('20144', '2'): 30266.6, ('20143', '4'): 52490.46}

Given your example, this is what trimmed looks like after the for loop:

{('20144', '4'): [12321.88, 5861.56], ('20144', '31'): [30965.4], ('20143', '4'): [52490.46], ('20144', '2'): [12911.5, 17355.1], ('20143', '31'): [5700.53]}

Explanation:

The for loop iterates over the keys (k) and values (v) of your sample data. If the key (k[0], k[-1]) (i.e. the first and last elements of your key tuple, for instance ('20144', '4')) does not yet exist in the new dictionary trimmed, setdefault creates an empty list for it and the value (v) is appended. If the key does exist, the value is simply appended to the existing list.

Once the trimmed dictionary is complete, a simple dictionary comprehension sums each of these lists.
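To watch the two steps on a small input, here is a minimal sketch. The data in `sample` is made up for illustration (it is not the asker's data), and the values are chosen so that the float sums are exact.

```python
# Hypothetical sample data: 4-tuples as keys, floats as values.
sample = {
    ('20144', 'A', 'x', '4'): 1.5,
    ('20144', 'B', 'y', '4'): 2.5,
    ('20143', 'C', 'z', '31'): 4.0,
}

# Step 1: collect the values under a shortened key (first and last element).
trimmed = {}
for k, v in sample.items():
    trimmed.setdefault((k[0], k[-1]), []).append(v)

# Step 2: sum each collected list.
totals = {k: sum(v) for k, v in trimmed.items()}

print(trimmed)
print(totals)
```

Keeping the intermediate lists is only useful if you also want to inspect the individual values; otherwise the sums can be accumulated directly.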

Edit:

As pointed out in the comments, you can also use defaultdict from collections if performance is an issue:

from collections import defaultdict

trimmed=defaultdict(float)
for k,v in data.items(): 
    trimmed[(k[0],k[-1])]+=v

Here the sums are stored directly in trimmed. A newly initialized key in the trimmed defaultdict has the value 0.0, so you can simply add v in place.
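The default-value behaviour can be checked in isolation. The following is a small sketch with made-up keys and values; it also shows one caveat worth knowing, namely that merely reading a missing key inserts it.

```python
from collections import defaultdict

totals = defaultdict(float)

# float() returns 0.0, so a missing key starts at 0.0 on first access.
totals[('20144', '4')] += 1.5
totals[('20144', '4')] += 2.5

# Caveat: reading a missing key also creates it with the default value.
_ = totals[('20143', '31')]

# Convert to a regular dict once the sums are done.
plain = dict(totals)
print(plain)
```

Converting to a plain dict at the end avoids accidentally creating new keys later through lookups.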

Answer 2

This can get you the results you are looking for:

# Named "data" rather than "dict" to avoid shadowing the built-in type.
data = {('20144', 'Bundesministerium für Land- und Forstwirtschaft Umwelt und Wasserwirtschaft', 'Woman', '2'): 12911.5, ('20144', 'Wirtschaftskammer Tirol - Sparte Gewerbe und Handwerk Innung der Lebensmittelgewerbe', 'ORF Radio Tirol', '4'): 5861.56, ('20144', 'Bundesministerium für Land- und Forstwirtschaft Umwelt und Wasserwirtschaft', 'Weekend Magazin', '2'): 17355.1, ('20144', 'Bundesministerium für Wissenschaft Forschung und Wirtschaft', 'Die Presse', '31'): 30965.4, ('20144', 'Wirtschaftskammer Österreich Fachverband der Telekommunikations- und Rundfunkunternehmungen', 'Bezirksrundschau Oberösterreich', '4'): 12321.88, ('20143', 'Wirtschaftskammer Niederösterreich Fachgruppe Unternehmensberatung und Informationstechnologie NÖ', 'trend', '31'): 5700.53, ('20143', 'Bundesministerium für Europa Integration und Äußeres', 'Kronen Zeitung', '4'): 52490.46}
sum_by_key = {}
for key, value in data.items():
    # Group by the first and last element of each key tuple.
    sum_key = (key[0], key[-1])
    if sum_key in sum_by_key:
        sum_by_key[sum_key] += value
    else:
        sum_by_key[sum_key] = value

print(sum_by_key)

The output:

{('20144', '2'): 30266.6, ('20143', '31'): 5700.53, ('20144', '31'): 30965.4, ('20144', '4'): 18183.44, ('20143', '4'): 52490.46}

Answer 3

You can use itertools.groupby. See if the following code suits you (I used d as your dict).

Edit: the keys need to be sorted before grouping.

import itertools

fields = lambda k: (k[0], k[3])  # first and last element of each 4-tuple key
for k, i in itertools.groupby(sorted(d, key=fields), key=fields):
    print(k, sum(d[v] for v in i))

('20143', '31') 5700.53
('20143', '4') 52490.46
('20144', '2') 30266.6
('20144', '31') 30965.4
('20144', '4') 18183.44
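The reason sorting matters is that itertools.groupby only merges consecutive items with equal keys. The sketch below illustrates this with made-up data in which two matching keys are not adjacent; it assumes Python 3.7+, where dicts iterate in insertion order.

```python
import itertools

# Hypothetical sample data; the two ('20144', ..., '4') keys are not adjacent.
d = {
    ('20144', 'A', 'x', '4'): 1.5,
    ('20143', 'C', 'z', '31'): 4.0,
    ('20144', 'B', 'y', '4'): 2.5,
}

def fields(k):
    """Return the grouping key: first and last element of a 4-tuple."""
    return (k[0], k[3])

# Without sorting, ('20144', '4') shows up as two separate groups.
unsorted_groups = [
    (group, sum(d[k] for k in keys))
    for group, keys in itertools.groupby(d, key=fields)
]

# With sorting, each (first, last) pair appears exactly once.
sorted_groups = [
    (group, sum(d[k] for k in keys))
    for group, keys in itertools.groupby(sorted(d, key=fields), key=fields)
]

print(unsorted_groups)
print(sorted_groups)
```

The sort costs O(n log n), so for large dictionaries a single pass with a dict or defaultdict is usually the cheaper option.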

Answer 4

Here's how it can be done in one pass, taking advantage of defaultdict from the standard library:

import collections
output_dict = collections.defaultdict(float)
for key, value in input_dict.items():
    output_dict[ (key[0], key[-1]) ] += value


# show the output
print('\n'.join('%r: %r' % (key,value) for key, value in output_dict.items()))

Prints as follows:

('20144', '2'): 30266.6
('20143', '31'): 5700.53
('20144', '31'): 30965.4
('20144', '4'): 18183.44
('20143', '4'): 52490.46
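If the positions to group by may change, the same one-pass idea could be wrapped in a small helper. This is only a sketch: `sum_by_positions` and its `positions` parameter are hypothetical names that do not appear in the answers, and the helper assumes at least two positions are given (with a single position, `itemgetter` returns a bare element instead of a tuple).

```python
from collections import defaultdict
from operator import itemgetter

def sum_by_positions(data, positions=(0, -1)):
    """Sum the values of `data`, grouped by the key elements at `positions`.

    Hypothetical helper; assumes tuple keys and at least two positions.
    """
    pick = itemgetter(*positions)  # returns a tuple for two or more positions
    totals = defaultdict(float)
    for key, value in data.items():
        totals[pick(key)] += value
    return dict(totals)

# Made-up sample data with exactly representable float values.
sample = {
    ('20144', 'A', 'x', '4'): 1.5,
    ('20144', 'B', 'y', '4'): 2.5,
    ('20143', 'C', 'z', '31'): 4.0,
}

by_first_last = sum_by_positions(sample)
by_first_third = sum_by_positions(sample, positions=(0, 2))
print(by_first_last)
print(by_first_third)
```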


Answer 1: score 1, accepted, 2016-12-16 16:32:18
Answer 2: score 1, 2016-12-16 18:14:50
Answer 3: score 0, 2016-12-16 16:26:47
Answer 4: score 0, 2016-12-16 17:14:10
