Add the other values of a dictionary if certain keys are the same

This is my input. I have a list of dictionaries:

[{'name1':'a', 'name2':'b','val1':10,'val2':20},
 {'name1':'a', 'name2':'b','val1':15,'val2':25},
 {'name1':'r', 'name2':'s','val1':30,'val2':20}] 

If two dictionaries have the same values for both name1 and name2, then add up their val1 and val2 values.

Here is the expected output:

[{'name1':'a', 'name2':'b','val1':25,'val2':45},
 {'name1':'r', 'name2':'s','val1':30,'val2':20}] 

In the first and second dicts, name1 is a and name2 is b in both, so we add their values.

I was trying with a loop but was not getting anywhere.

You can use collections.Counter and itertools.groupby:

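>>> from collections import Counter
>>> from itertools import groupby
>>> dicts = [{'name1':'a', 'name2':'b','val1':10,'val2':20},
 {'name1':'a', 'name2':'b','val1':15,'val2':25},
 {'name1':'r', 'name2':'s','val1':30,'val2':20}]
>>> new_dicts = []
>>> # Group consecutive dicts by (name1, name2); the key only reads, so dicts is left unchanged
>>> for k, groups in groupby(dicts, lambda d: (d['name1'], d['name2'])):
        new_d = {
             'name1': k[0],
             'name2': k[1],
             # Add up only the numeric fields of the group via Counter addition
             **sum([Counter({'val1': g['val1'], 'val2': g['val2']}) for g in groups], Counter())
            }
        new_dicts.append(new_d)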

>>> new_dicts
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
 {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
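
One caveat with the Counter approach: the + operator on Counter objects keeps only keys whose summed count is positive, so a total of zero or less silently disappears. The sketch below illustrates this with made-up values (not taken from the question) and shows Counter.update, which keeps such totals. The helper name sum_fields is hypothetical.

```python
from collections import Counter

# Counter addition drops keys whose total is zero or negative.
dropped = Counter({'val1': -5, 'val2': 20}) + Counter({'val1': 5, 'val2': 25})

def sum_fields(rows):
    """Sum numeric fields across rows, keeping zero and negative totals."""
    total = Counter()
    for row in rows:
        total.update(row)  # update() adds counts and keeps non-positive results
    return dict(total)

kept = sum_fields([{'val1': -5, 'val2': 20}, {'val1': 5, 'val2': 25}])
print(dropped)  # 'val1' is missing here
print(kept)     # 'val1' is present with a total of 0
```

If your values can be zero or negative, accumulating with update (or a plain dict) is the safer choice.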

On the other hand, if you use pandas:

>>> import pandas as pd
>>> pd.DataFrame(dicts).groupby(['name1', 'name2']).sum().reset_index().to_dict('records')
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
 {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]

If you want to do this without modules, you can try:

>>> new_dicts = []
>>> for d in dicts:
        if not new_dicts:
            new_dicts.append(dict(d))  # copy, so the input dicts are not modified
        else:
            last_dict = new_dicts[-1]
            # Same names as the previous entry: accumulate into it
            if (last_dict['name1'], last_dict['name2']) == (d['name1'], d['name2']):
                last_dict['val1'] += d['val1']
                last_dict['val2'] += d['val2']
            else:
                new_dicts.append(dict(d))
>>> new_dicts
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
 {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]

NOTE:

The first and third solutions assume that your list is sorted, i.e. entries with the same name1 and name2 appear consecutively. If that is not the case, you can add this line at the beginning:

>>> dicts = sorted(dicts, key=lambda x: (x['name1'], x['name2']))
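
To see why the sort matters, here is a minimal sketch showing that itertools.groupby only merges adjacent items. The unsorted_dicts list and the name_key helper are made up for illustration; they reorder the question's data so the two ('a', 'b') rows are no longer next to each other.

```python
from itertools import groupby

def name_key(d):
    """Grouping key: the (name1, name2) pair of a row."""
    return (d['name1'], d['name2'])

# Hypothetical unsorted input: the two ('a', 'b') rows are not adjacent.
unsorted_dicts = [
    {'name1': 'a', 'name2': 'b', 'val1': 10, 'val2': 20},
    {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20},
    {'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
]

# Without sorting, ('a', 'b') forms two separate groups.
keys_unsorted = [k for k, _ in groupby(unsorted_dicts, key=name_key)]
# After sorting by the same key, each pair forms exactly one group.
keys_sorted = [k for k, _ in groupby(sorted(unsorted_dicts, key=name_key), key=name_key)]

print(keys_unsorted)
print(keys_sorted)
```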

You can just iterate and use an intermediate dictionary keyed by (name1, name2) to achieve linear time complexity. This does not require the list to be sorted.

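>>> l = [{'name1':'a', 'name2':'b','val1':10,'val2':20},
...      {'name1':'a', 'name2':'b','val1':15,'val2':25},
...      {'name1':'r', 'name2':'s','val1':30,'val2':20}]
>>> res = {}  # maps (name1, name2) -> (sum of val1, sum of val2)
>>> for d in l: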
...     name1, name2, val1, val2 = d['name1'], d['name2'], d['val1'], d['val2']
...     if (name1, name2) in res:
...             res[(name1, name2)] = res[(name1, name2)][0] + val1, res[(name1, name2)][1] + val2
...     else:
...             res[(name1, name2)] = (val1, val2)
... 
>>> res
{('a', 'b'): (25, 45), ('r', 's'): (30, 20)}
>>> output = [{'name1': k[0], 'name2': k[1], 'val1': v[0], 'val2': v[1]} for k,v in res.items()]
>>> output
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45}, {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
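
The same intermediate-dictionary idea can be wrapped in a reusable function. This is only a sketch: the function name merge_by_keys and its parameters key_fields and value_fields are hypothetical, and it assumes every row contains all the listed fields with numeric values.

```python
from collections import defaultdict

def merge_by_keys(rows, key_fields, value_fields):
    """Sum value_fields over rows that share the same key_fields values."""
    # One running list of sums per distinct key tuple.
    totals = defaultdict(lambda: [0] * len(value_fields))
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        for i, f in enumerate(value_fields):
            totals[key][i] += row[f]
    # Rebuild dicts; keys come out in order of first appearance.
    return [
        {**dict(zip(key_fields, key)), **dict(zip(value_fields, sums))}
        for key, sums in totals.items()
    ]

rows = [
    {'name1': 'a', 'name2': 'b', 'val1': 10, 'val2': 20},
    {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20},
    {'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
]
merged = merge_by_keys(rows, ['name1', 'name2'], ['val1', 'val2'])
print(merged)
```

Because the lookup is by key rather than by position, the input does not need to be sorted, and the input dicts are not modified.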

Run it through pandas, which is very good at this type of thing (and yes, this could probably be collapsed down to one or two chained statements):

In [37]: a                                                                                    
Out[37]: 
[{'name1': 'a', 'name2': 'b', 'val1': 10, 'val2': 20},
 {'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
 {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]

In [38]: df =  pd.DataFrame(a)                                                                

In [39]: df                                                                                   
Out[39]: 
  name1 name2  val1  val2
0     a     b    10    20
1     a     b    15    25
2     r     s    30    20

In [40]: grouped_sum = df.groupby(['name1', 'name2']).sum()                                   

In [41]: grouped_sum                                                                          
Out[41]: 
             val1  val2
name1 name2            
a     b        25    45
r     s        30    20

In [42]: grouped_sum.reset_index(inplace=True)                                                

In [43]: data = grouped_sum.to_dict('records')                                                

In [44]: data                                                                                 
Out[44]: 
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
 {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]

I suggest you post the code you tried and then ask for help, so others can help by suggesting changes. But something like this can help you:

di = [{'name1': 'a', 'name2': 'a', 'val1': 10, 'val2': 20},
      {'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
      {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]

for i in di:
    if i['name1'] == i['name2']:
        print("sum:", i['val1']+i['val2'])

It prints the sum of val1 and val2 if name1 and name2 are equal.
