In Python, count unique key/value pairs in a dictionary

I have a dictionary whose values are lists. Some of these values are also keys, or values, in other key/value pairs in the dictionary. I would simply like to count how many unique pairs there are in the dictionary.

Ex. dict = {'dog':['milo','otis','laurel','hardy'],'cat':['bob','joe'],'milo':['otis','laurel','hardy','dog'],'bob':['cat','joe'],'hardy':['dog']}

I need to count the number of key/value pairs that do not share a key or value with another pair in the dict. For example, the above should count to only 2: the pairs connected to dog and the pairs connected to cat. Even though milo is unique to dog, dog also appears in the key/value pair 'hardy', so both of these should be counted together (i.e., only 1). (See comments below) I have tried to go about it by replacing a key (key A) that exists in the values of another key (key B) with 'key B', without success, however, as I cannot specify key B correctly.

for keys, values in dict.iteritems():
    for key, value in dict.iteritems():
        if key in values:
            dict[keys] = dict.pop(key)

Is there an easier method? Thanks in advance...

If I understand the problem correctly, your dictionary is the adjacency map of a graph and you're trying to find its connected components. The regular algorithm (using a depth- or breadth-first search) may not work correctly since your graph is not undirected (e.g. you have edges from "bob" and "cat" to "joe", but none coming out from "joe").

Instead, I suggest using a disjoint set data structure. It's not hard to build one using a dictionary to handle the mapping of values to parents. Here's an implementation I wrote for a previous question:

class DisjointSet:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def find(self, element):
        if element not in self.parent: # leader elements are not in `parent` dict
            return element
        leader = self.find(self.parent[element]) # search recursively
        self.parent[element] = leader # compress path by saving leader as parent
        return leader

    def union(self, leader1, leader2):
        rank1 = self.rank.get(leader1,0)
        rank2 = self.rank.get(leader2,0)

        if rank1 > rank2: # union by rank
            self.parent[leader2] = leader1
        elif rank2 > rank1:
            self.parent[leader1] = leader2
        else: # ranks are equal
            self.parent[leader2] = leader1 # favor leader1 arbitrarily
            self.rank[leader1] = rank1+1 # increment rank

And here's how you could use it to solve your problem:

djs = DisjointSet()
all_values = set()
for key, values in my_dict.items():
    all_values.add(key)
    all_values.update(values)
    for val in values:
        l1 = djs.find(key)
        l2 = djs.find(val)
        if l1 != l2:
            djs.union(l1, l2)

roots = {djs.find(x) for x in all_values}
print("The number of disjoint sets is:", len(roots))

The first part of this code does two things. First, it builds a set with all the unique nodes found anywhere in the graph. Second, it combines the nodes into disjoint sets by doing a union wherever there's an edge.

The second part builds up a set of "root" elements from the disjoint set; the number of roots is the number of groups.
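As a rough alternative to the disjoint set, the directed adjacency map can first be made symmetric, after which a plain breadth-first search does find the connected groups. The sketch below is not part of the answer above; the function name `count_components` is made up for illustration, and it assumes every value in the lists is hashable.

```python
from collections import defaultdict, deque

def count_components(adjacency):
    """Count connected groups, treating every key -> value edge as two-way."""
    # Build an undirected view: store each edge in both directions.
    undirected = defaultdict(set)
    for key, values in adjacency.items():
        undirected[key].update(values)   # also registers keys with empty lists
        for val in values:
            undirected[val].add(key)
    undirected = dict(undirected)        # freeze: no new keys during the search

    visited = set()
    components = 0
    for start in undirected:
        if start in visited:
            continue
        components += 1                  # found a node in a new group
        visited.add(start)
        queue = deque([start])
        while queue:                     # breadth-first search from `start`
            node = queue.popleft()
            for neighbor in undirected[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
    return components

my_dict = {'dog': ['milo', 'otis', 'laurel', 'hardy'],
           'cat': ['bob', 'joe'],
           'milo': ['otis', 'laurel', 'hardy', 'dog'],
           'bob': ['cat', 'joe'],
           'hardy': ['dog']}
print(count_components(my_dict))  # 2
```

The extra pass that mirrors each edge costs additional memory, which is one reason the disjoint set may be preferable for large inputs.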

Here is one possible solution:

values = {'dog':['milo','otis','laurel','hardy'],
          'cat':['bob','joe'],
          'milo':['otis','laurel','hardy','dog'],
          'bob':['cat','joe'],
          'hardy':['dog']}

result = []

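# Python 3: items() replaces iteritems(), and print is a function
for x in values.items():
    y = set([x[0]] + x[1])  # the key together with its values
    # keep y only if it shares no element with a group already kept
    if not any(z.intersection(y) for z in result):
        result.append(y)

print(len(result))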

Note that you shouldn't name a variable dict because that shadows the built-in type dict.

Your goal is unclear, but you can modify the construction of the y set to meet your needs.
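One possible modification, assuming the goal is to count connected groups: the loop above drops a set as soon as it overlaps any stored group, so a pair that bridges two stored groups leaves them counted separately. The sketch below merges overlapping groups instead. The helper name `merge_groups` and the `bridged` sample data are made up for illustration.

```python
def merge_groups(mapping):
    """Return a list of disjoint sets, merging any groups that share an element."""
    groups = []
    for key, vals in mapping.items():
        merged = {key, *vals}            # the key together with its values
        remaining = []
        for group in groups:
            if group & merged:           # overlap: absorb the existing group
                merged |= group
            else:
                remaining.append(group)
        remaining.append(merged)
        groups = remaining               # stored groups stay pairwise disjoint
    return groups

values = {'dog': ['milo', 'otis', 'laurel', 'hardy'],
          'cat': ['bob', 'joe'],
          'milo': ['otis', 'laurel', 'hardy', 'dog'],
          'bob': ['cat', 'joe'],
          'hardy': ['dog']}
print(len(merge_groups(values)))   # 2

# 'b' -> 'c' links the first two pairs, so all four names form one group.
bridged = {'a': ['b'], 'c': ['d'], 'b': ['c']}
print(len(merge_groups(bridged)))  # 1
```

On the question's data both versions give 2; they differ only when a later pair connects groups that were stored separately.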

If I understand your question correctly, you are trying to describe a graph-like structure, and you're looking at whether the keys appear in a value list. Since you are only interested in the count, you don't have to worry about future value lists when iterating through the dict, so this should work:

d = {'dog': ['milo','otis','laurel','hardy'], 'cat': ['bob','joe'], 'milo': ['otis','laurel','hardy','dog'], 'bob': ['cat','joe'], 'hardy': ['dog']}
seen = set()
unique = []
for key, values in d.items():  # use iteritems() on Python 2
    if key not in seen:  # key has not appeared in any earlier value list
        unique.append(key)
    seen = seen.union(values)
print(len(unique))

Note that the actual values contained in unique depend on dict ordering, and are only keys, not values. If you are actually trying to do some sort of network or graph analysis, I suggest you make use of a library such as networkx.
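To illustrate the ordering caveat, here is a small sketch that wraps the loop above in a function (the name `unseen_keys` is made up for illustration) and runs it on the same data in two different key orders. It assumes Python 3.7 or later, where a dict iterates in insertion order. With this particular reordering the length of the result changes as well, not just its contents.

```python
def unseen_keys(d):
    """Return the keys that did not appear in any earlier value list."""
    seen = set()
    unique = []
    for key, values in d.items():
        if key not in seen:
            unique.append(key)
        seen.update(values)
    return unique

d = {'dog': ['milo', 'otis', 'laurel', 'hardy'],
     'cat': ['bob', 'joe'],
     'milo': ['otis', 'laurel', 'hardy', 'dog'],
     'bob': ['cat', 'joe'],
     'hardy': ['dog']}

# Same pairs, inserted in a different order.
reordered = {k: d[k] for k in ['hardy', 'milo', 'dog', 'cat', 'bob']}

print(unseen_keys(d))          # ['dog', 'cat']
print(unseen_keys(reordered))  # ['hardy', 'milo', 'cat']
```

In the second run 'milo' is visited before any value list mentions it, so it is recorded even though it belongs to the same group as 'hardy'.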
