Merging multiple dictionaries that contain lists of dictionaries

Question

I have several dictionaries (perhaps tens of them) that are shaped like the one below:

{'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 135},
            {'foo': 'C', 'bar': 'B', 'host': 'egg', 'count': 28},
            {'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 1},
            {'foo': 'A', 'bar': 'E', 'host': 'chicken breast', 'count': 1},
            {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
 'stderr': ''}

I want to combine all of those dictionaries by summing the integer under the 'count' key for entries that share the same 'foo', 'bar' and 'host' values (None is NoneType).

For example, given two dictionaries:

dictA = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 135},
            {'foo': 'C', 'bar': 'B', 'host': 'egg', 'count': 28},
            {'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 2},
            {'foo': 'A', 'bar': 'E', 'host': 'chicken breast', 'count': 1},
            {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
 'stderr': ''}

dictB = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 280},
            {'foo': 'A', 'bar': 'B', 'host': 'orange', 'count': 46},
            {'foo': 'A', 'bar': 'E', 'host': 'pineapple', 'count': 3},
            {'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 2},
            {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
 'stderr': ''}

Then the merged version should be:

dictMerged = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 415},
            {'foo': 'A', 'bar': 'B', 'host': 'orange', 'count': 46},
            {'foo': 'C', 'bar': 'B', 'host': 'egg', 'count': 28},
            {'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 4},
            {'foo': 'A', 'bar': 'E', 'host': 'pineapple', 'count': 3},
            {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 2},
            {'foo': 'A', 'bar': 'E', 'host': 'chicken breast', 'count': 1}],
 'stderr': ''}

Note that the order of the dictionaries in the list changed after the 'count' values were summed.

As a first step, I tried to combine them by matching 'host' only, as shown below, but the result was not what I wanted:

hostname1 = {i["host"]: i for i in dictA['stdout']}
hostname2 = {i["host"]: i for i in dictB['stdout']}
all_host = hostname1|hostname2
{key: value + b[key] for key, value in a.items()}

Answer 1

One approach

from collections import defaultdict
from operator import itemgetter

# create a dictionary (defaultdict) that collects the dictionaries with matching foo, bar, host in the same list
groups = defaultdict(list, {(d['foo'], d['bar'], d['host']): [d] for d in dictB['stdout']})
for d in dictA["stdout"]:
    key = (d['foo'], d['bar'], d['host'])
    groups[key].append(d)

# use item getter for better readability
count = itemgetter("count")

# create new list of dictionaries, sum the count values
ds = [{'foo': f, 'bar': b, 'host': h, 'count': sum(count(d) for d in v)} for (f, b, h), v in groups.items()]

# sort the list of dictionaries in decreasing order 
res = {"stdout": sorted(ds, key=count, reverse=True), "stderr": ""}
print(res)

Output

{'stderr': '',
 'stdout': [{'bar': 'B', 'count': 415, 'foo': 'A', 'host': None},
            {'bar': 'B', 'count': 46, 'foo': 'A', 'host': 'orange'},
            {'bar': 'B', 'count': 28, 'foo': 'C', 'host': 'egg'},
            {'bar': 'E', 'count': 4, 'foo': 'D', 'host': 'apple'},
            {'bar': 'E', 'count': 3, 'foo': 'A', 'host': 'pineapple'},
            {'bar': 'F', 'count': 2, 'foo': 'C', 'host': 'carrot'},
            {'bar': 'E', 'count': 1, 'foo': 'A', 'host': 'chicken breast'}]}

For more on each of the functions and data structures used in the code above, see: sorted, defaultdict and itemgetter.

One alternative

Use groupby:
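
As a more compact variant of the same grouping idea, the counts can also be accumulated directly with collections.Counter instead of collecting the dictionaries into lists first. This is only a sketch, not part of the original answer; the names totals and merged are chosen here for illustration, and the data is the question's dictA and dictB.

```python
from collections import Counter

dictA = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 135},
                    {'foo': 'C', 'bar': 'B', 'host': 'egg', 'count': 28},
                    {'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 2},
                    {'foo': 'A', 'bar': 'E', 'host': 'chicken breast', 'count': 1},
                    {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
         'stderr': ''}

dictB = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 280},
                    {'foo': 'A', 'bar': 'B', 'host': 'orange', 'count': 46},
                    {'foo': 'A', 'bar': 'E', 'host': 'pineapple', 'count': 3},
                    {'foo': 'D', 'bar': 'E', 'host': 'apple', 'count': 2},
                    {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
         'stderr': ''}

# Accumulate the counts per (foo, bar, host) key; None is hashable, so it works as part of the key.
totals = Counter()
for source in (dictA, dictB):
    for d in source['stdout']:
        totals[(d['foo'], d['bar'], d['host'])] += d['count']

# most_common() yields the (key, total) pairs in decreasing order of total.
merged = {
    'stdout': [{'foo': f, 'bar': b, 'host': h, 'count': c}
               for (f, b, h), c in totals.most_common()],
    'stderr': '',
}
print(merged)
```

Because the totals are summed as the rows are read, no intermediate lists of dictionaries are kept, which may matter when there are many inputs.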

import pprint
from operator import itemgetter
from itertools import groupby


def key(d):
    return d["foo"], d["bar"], d["host"] or ""


count = itemgetter("count")
lst = sorted(dictA["stdout"] + dictB["stdout"], key=key)
groups = groupby(lst, key=key)
ds = [{'foo': f, 'bar': b, 'host': h or None, 'count': sum(count(d) for d in vs)} for (f, b, h), vs in groups]
res = {"stdout": sorted(ds, key=count, reverse=True), "stderr": ""}
print(res)
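
The reason the list is sorted before calling groupby is that groupby only merges consecutive items with equal keys. A small sketch with a made-up list of host names (the hosts list is purely illustrative) shows the difference:

```python
from itertools import groupby

# Hypothetical sample data: 'apple' appears twice, but not next to each other.
hosts = ['apple', 'egg', 'apple']

# Without sorting, the two 'apple' items end up in separate groups.
unsorted_groups = [(k, len(list(g))) for k, g in groupby(hosts)]

# After sorting, equal keys are adjacent and form a single group.
sorted_groups = [(k, len(list(g))) for k, g in groupby(sorted(hosts))]

print(unsorted_groups)
print(sorted_groups)
```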

This second approach has two caveats:

The grouping step is O(n log n), because the list has to be sorted before groupby can be used; in the first approach the grouping step was O(n).
In order to sort the list of dictionaries, it needs to replace None with the empty string "", because None cannot be ordered against a str.
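
To illustrate the second caveat, the sketch below shows the TypeError that Python 3 raises when None and a string meet in a sort key, plus one possible key that keeps None apart from a genuine empty string. The helper names naive_key and none_safe_key and the rows list are assumptions made for this example, not part of the answer.

```python
# Hypothetical rows: one real host, one missing host (None), one empty-string host.
rows = [
    {'foo': 'A', 'bar': 'B', 'host': 'orange'},
    {'foo': 'A', 'bar': 'B', 'host': None},
    {'foo': 'A', 'bar': 'B', 'host': ''},
]


def naive_key(d):
    return d['foo'], d['bar'], d['host']


# Sorting with the raw host fails as soon as None is compared with a str.
try:
    sorted(rows, key=naive_key)
    failed = False
except TypeError:
    failed = True


def none_safe_key(d):
    host = d['host']
    # The boolean sorts None after every string, and keeps None distinct from ''.
    return d['foo'], d['bar'], host is None, host or ''


ordered = sorted(rows, key=none_safe_key)
hosts = [d['host'] for d in ordered]
print(failed, hosts)
```

With the answer's `d["host"] or ""` key, a row whose host is really an empty string would be merged with the None rows; a key like the one above avoids that, at the cost of a slightly longer tuple.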

Multiple dictionaries

If you have multiple dictionaries, you can change the first approach to:

# create a dictionary (defaultdict) to put the dictionaries with matching foo, bar, host in the same list
groups = defaultdict(list, {(d['foo'], d['bar'], d['host']): [d] for d in dictB['stdout']})

# create a list with all the dictionaries from multiple dict
data = []
lst = [dictA]  # change this line to contain all the dictionaries except B
for d in lst:
    data.extend(d["stdout"])

for d in data:
    key = (d['foo'], d['bar'], d['host'])
    groups[key].append(d)

# use item getter for better readability
count = itemgetter("count")

# create new list of dictionaries, sum the count values
ds = [{'foo': f, 'bar': b, 'host': h, 'count': sum(count(d) for d in v)} for (f, b, h), v in groups.items()]

# sort the list of dictionaries in decreasing order
res = {"stdout": sorted(ds, key=count, reverse=True), "stderr": ""}
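
The same steps can also be wrapped in a function that treats every input dictionary alike, so that no dictionary has to be singled out to seed the groups. This is a sketch under the assumption that each input has a 'stdout' list shaped like the ones in the question; merge_outputs and the three small inputs are hypothetical names and data used only for illustration.

```python
from collections import defaultdict
from itertools import chain
from operator import itemgetter


def merge_outputs(dicts):
    """Sum 'count' over all rows that share the same (foo, bar, host)."""
    groups = defaultdict(list)
    # Walk through every row of every input dictionary in one pass.
    for d in chain.from_iterable(src['stdout'] for src in dicts):
        groups[(d['foo'], d['bar'], d['host'])].append(d)

    count = itemgetter('count')
    ds = [{'foo': f, 'bar': b, 'host': h, 'count': sum(map(count, v))}
          for (f, b, h), v in groups.items()]
    # Largest totals first, as in the expected output of the question.
    return {'stdout': sorted(ds, key=count, reverse=True), 'stderr': ''}


# Hypothetical inputs, kept small on purpose.
d1 = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 1},
                 {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
      'stderr': ''}
d2 = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 2}],
      'stderr': ''}
d3 = {'stdout': [{'foo': 'A', 'bar': 'B', 'host': None, 'count': 3},
                 {'foo': 'C', 'bar': 'F', 'host': 'carrot', 'count': 1}],
      'stderr': ''}

result = merge_outputs([d1, d2, d3])
print(result)
```

Starting from an empty defaultdict also means that rows repeated within a single input are all counted, whereas seeding the groups from a dict comprehension keeps only the last row for each key.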

What is `itemgetter`?

From the documentation:

Return a callable object that fetches item from its operand using the operand's __getitem__() method. If multiple items are specified, returns a tuple of lookup values.

It is equivalent to:

def itemgetter(*items):
    if len(items) == 1:
        item = items[0]
        def g(obj):
            return obj[item]
    else:
        def g(obj):
            return tuple(obj[item] for item in items)
    return g
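
A short usage sketch, with a row taken from the question's data, shows both forms: one item gives back the value itself, several items give back a tuple. The variable names are chosen here for illustration.

```python
from operator import itemgetter

row = {'foo': 'A', 'bar': 'B', 'host': None, 'count': 135}

# One item: the callable returns the value stored under that key.
count = itemgetter('count')
single = count(row)

# Several items: the callable returns a tuple of the looked-up values.
group_key = itemgetter('foo', 'bar', 'host')
triple = group_key(row)

print(single, triple)
```

The multi-item form could stand in for the hand-written (d['foo'], d['bar'], d['host']) tuple used as the grouping key in the first approach.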
