
How to merge multiple dictionaries from separate lists if they share any key-value pairs?

How can I merge dictionaries from multiple lists if they share a common key-value pair?

For example, here are three lists of dicts:

l1 = [{'fruit':'banana','category':'B'},{'fruit':'apple','category':'A'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'2','type':'old'},{'order':'1','type':'new'}]

Desired result:

l = [{'fruit':'apple','category':'A','order':'1','type':'new'},{'fruit':'banana','category':'B','order':'2','type':'old'}]

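The tricky part is that I want this function to take only the lists as arguments, not the keys. I want to be able to plug in any number of lists of dicts without worrying about which key names overlap (in this case, the key names that tie all three lists together are 'category' and 'type').

I should note that index does not matter: the merge should be based only on common elements.

Here's my attempt: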

def combine_lists(*args):
    base_list = args[0]
    L = []
    for sublist in args[1:]:
        L.extend(sublist)
    for D in base_list:
        for Dict in L:
            if any([tup in Dict.items() for tup in D.items()]): 
                D.update(Dict)
    return base_list

For this problem it is convenient to regard each dict as a list of tuples:

In [4]: list({'fruit':'apple','category':'A'}.items())
Out[4]: [('fruit', 'apple'), ('category', 'A')]

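Since we want to join dicts that share a key-value pair, we can treat each (key, value) tuple as a node in a graph and each pair of tuples that occur in the same dict as an edge. Once the graph is built, the problem reduces to finding its connected components.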
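To see why a graph is needed, it helps to look at which (key, value) pairs actually occur in more than one dict. The following is only an illustrative sketch; the names `pair_to_dicts` and `shared` are made up for this example and are not part of the answer's code.

```python
from collections import defaultdict

l1 = [{'fruit':'apple','category':'A'},{'fruit':'banana','category':'B'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'1','type':'new'},{'order':'2','type':'old'}]

dicts = l1 + l2 + l3

# Map each (key, value) pair to the positions of the dicts that contain it.
pair_to_dicts = defaultdict(list)
for index, dct in enumerate(dicts):
    for pair in dct.items():
        pair_to_dicts[pair].append(index)

# Keep only the pairs that appear in more than one dict.
shared = {pair: idxs for pair, idxs in pair_to_dicts.items() if len(idxs) > 1}
print(shared)
```

The apple dict (position 0) and the order '1' dict (position 4) share no pair directly; they are linked only through the dict at position 2. That transitive linkage is what connected components capture.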

Using networkx,

import itertools as IT
import networkx as nx

l1 = [{'fruit':'apple','category':'A'},{'fruit':'banana','category':'B'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'1','type':'new'},{'order':'2','type':'old'}]

data = [l1, l2, l3]
G = nx.Graph()
for dct in IT.chain.from_iterable(data):
    # Each (key, value) pair is a node; link every pair to the dict's first pair.
    items = list(dct.items())
    node1 = items[0]
    for node2 in items[1:]:
        G.add_edge(node1, node2)

# connected_component_subgraphs is gone from newer networkx releases;
# connected_components yields each component as a set of nodes instead.
for cc in nx.connected_components(G):
    print(dict(cc))  # key order within each printed dict may vary

yields

{'category': 'A', 'fruit': 'apple', 'type': 'new', 'order': '1'}
{'category': 'B', 'fruit': 'banana', 'type': 'old', 'order': '2'}

If you wish to remove the networkx dependency, you could use, for example, pillmuncher's implementation:

import itertools as IT

def connected_components(neighbors):
    """
    https://stackoverflow.com/a/13837045/190597 (pillmuncher)
    """
    seen = set()
    def component(node):
        nodes = set([node])
        while nodes:
            node = nodes.pop()
            seen.add(node)
            nodes |= neighbors[node] - seen
            yield node
    for node in neighbors:
        if node not in seen:
            yield component(node)

l1 = [{'fruit':'apple','category':'A'},{'fruit':'banana','category':'B'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'1','type':'new'},{'order':'2','type':'old'}]

data = [l1, l2, l3]
G = {}
for dct in IT.chain.from_iterable(data):
    items = list(dct.items())  # list() so the pairs can be indexed and sliced on Python 3
    node1 = items[0]
    for node2 in items[1:]:
        G.setdefault(node1, set()).add(node2)
        G.setdefault(node2, set()).add(node1)

for cc in connected_components(G):
    print(dict(cc))

which prints the same result as above.
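Since the question asks for a function that takes only the lists as arguments, the same idea can be wrapped up as below. This is a minimal sketch, not the answer's own code: the name `merge_dict_lists` is hypothetical, and it swaps in a small union-find in place of the graph search. It assumes all values are hashable, and if one group contains two different values for the same key, whichever is seen last wins.

```python
def merge_dict_lists(*lists):
    """Merge dicts linked, directly or transitively, by a shared (key, value) pair.

    Hypothetical helper for illustration; assumes hashable values.
    """
    parent = {}  # union-find forest over (key, value) pairs

    def find(node):
        # Walk up to the root, shortening the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for lst in lists:
        for dct in lst:
            items = list(dct.items())
            for item in items:
                parent.setdefault(item, item)
            # Join every pair of this dict with the dict's first pair.
            for item in items[1:]:
                parent[find(item)] = find(items[0])

    # Collect the pairs of each group into one merged dict.
    groups = {}
    for key, value in parent:
        groups.setdefault(find((key, value)), {})[key] = value
    return list(groups.values())


l1 = [{'fruit':'banana','category':'B'},{'fruit':'apple','category':'A'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'2','type':'old'},{'order':'1','type':'new'}]

merged = merge_dict_lists(l1, l2, l3)
for dct in merged:
    print(dct)
```

Because grouping depends only on shared pairs, the order in which the lists are passed should not change which dicts end up merged, only the order of the output.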
