How to get total number of repeated objects and respective keys from a python dictionary having multiple objects?

I have a Python dictionary that consists of many nested dictionaries. It looks like this:

result = {
    123: {
        'route1': 'abc',
        'route2': 'abc1'
    },
    456: {
        'route1': 'abc',
        'route2': 'abc1'
    },
    789: {
        'route1': 'abc2',
        'route2': 'abc3'
    },
    101: {
        'route1': 'abc',
        'route2': 'abc1'
    },
    102: {
        'route1': 'ab4',
        'route2': 'abc5'
    }
}

Here we can see that 123, 456 and 101 have the same values. What I am trying to do is find the repeated object, which in this case is:

{
    'route1': 'abc',
    'route2': 'abc1'
}

and the keys that hold this repeated object, i.e. 123, 456 and 101. How can we do this?

Along with the repeated-object info, I also want to know which objects do not repeat, i.e. 789 and its object, and 102 and its object.

PS: Please note that I don't know beforehand which objects repeat, as this structure is generated inside code. So there might be no repeated object at all, or there could be more than one. Also, I cannot use pandas, numpy, etc. due to some restrictions.

Use collections.defaultdict:

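from collections import defaultdict

# group the outer keys by the content of their nested dict;
# a dict is not hashable, so a tuple of its items is used as the key
# (this relies on every nested dict listing its keys in the same order)
d = defaultdict(list)
for k, v in result.items():
    d[tuple(v.items())].append(k)

# look up the keys that share one particular nested dict
desired = {
    'route1': 'abc',
    'route2': 'abc1'
}
d[tuple(desired.items())]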

Output (key order may vary with the Python version):

[456, 123, 101]

For non-repeated items, use a list comprehension:

[v for v in d.values() if len(v) == 1]

Output:

[[102], [789]]
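If the nested dicts might list their keys in different orders, one possible variant is to key the grouping on a frozenset of the items instead of a tuple. The following is only a sketch under the assumption that every nested value is hashable (plain strings here); the names `groups`, `repeated` and `unique` are illustrative and not from the answer above. It also reports how many times each repeated object occurs, which is the count asked for in the title.

```python
from collections import defaultdict

result = {
    123: {'route1': 'abc', 'route2': 'abc1'},
    456: {'route1': 'abc', 'route2': 'abc1'},
    789: {'route1': 'abc2', 'route2': 'abc3'},
    101: {'route1': 'abc', 'route2': 'abc1'},
    102: {'route1': 'ab4', 'route2': 'abc5'},
}

# Group outer keys by nested-dict content. A frozenset of (key, value)
# pairs is hashable and ignores ordering (assumes hashable values).
groups = defaultdict(list)
for key, subdict in result.items():
    groups[frozenset(subdict.items())].append(key)

# Objects shared by more than one key, paired with the keys that hold them.
repeated = [(dict(items), keys) for items, keys in groups.items() if len(keys) > 1]
# Objects that occur exactly once, paired with their single key.
unique = [(dict(items), keys[0]) for items, keys in groups.items() if len(keys) == 1]

for obj, keys in repeated:
    print(obj, 'occurs', len(keys), 'times, under keys', keys)
for obj, key in unique:
    print(obj, 'occurs once, under key', key)
```

Because nothing is hard-coded, this also covers the cases mentioned in the question where there is no repeated object at all (`repeated` is then empty) or more than one.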

You can use the drop_duplicates() function of pandas:

First, transform your dict into a DataFrame:

import pandas as pd

df = pd.DataFrame(result).T

Output:

    route1  route2
123 abc     abc1
456 abc     abc1
789 abc2    abc3
101 abc     abc1
102 ab4     abc5

Then use the drop_duplicates function and convert the result back to a dict:

df2 = df.drop_duplicates(subset=['route1', 'route2']).T.to_dict()

Output:

{
 123: {
       'route1': 'abc', 
       'route2': 'abc1'
      },
 789: {
       'route1': 'abc2',
       'route2': 'abc3'
      },
 102: {
       'route1': 'ab4', 
       'route2': 'abc5'
      }
}

You can do this by creating a dictionary that holds all the matching keys for each distinct value in your result dict (where the values are themselves dicts). This is a fairly common pattern in Python: iterate through one container and aggregate values into a dict. Then, once you've created the aggregation dict, you can split it into duplicate and single values.

To build the aggregation dict, you need to use each subdict from result as a key and append the matching keys from the original dict to a list associated with that subdict. The challenge is that you can't use the subdicts directly as dictionary keys, because they are not hashable. But you can solve that by converting them to tuples. The tuples should also be sorted, to avoid missing duplicates whose items happen to come out in a different order.

It may be easier to understand just by looking at some example code:

result = {
    123: {'route1': 'abc', 'route2': 'abc1'},
    456: {'route1': 'abc', 'route2': 'abc1'},
    789: {'route1': 'abc2', 'route2': 'abc3'},
    101: {'route1': 'abc', 'route2': 'abc1'},
    102: {'route1': 'ab4', 'route2': 'abc5'}
}

# make a dict showing all the keys that match each subdict
cross_refs = dict()
for key, subdict in result.items():
    # make hashable version of subdict (can't use dict as lookup key)
    subdict_tuple = tuple(sorted(subdict.items()))
    # create an empty list of keys that match this val
    # (if needed), or retrieve existing list
    matching_keys = cross_refs.setdefault(subdict_tuple, [])
    # add this item to the list
    matching_keys.append(key)

# make lists of duplicates and non-duplicates
dups = {}
singles = {}
for subdict_tuple, keys in cross_refs.items():
    # convert hashed value back to a dict
    subdict = dict(subdict_tuple)
    if len(keys) > 1:
        # convert the list of matching keys to a tuple and use as the key
        dups[tuple(keys)] = subdict
    else:
        # there's only one matching key, so use that as the key
        singles[keys[0]] = subdict

print(dups)
# {
#     (456, 123, 101): {'route2': 'abc1', 'route1': 'abc'}
# }
print(singles)
# {
#     789: {'route2': 'abc3', 'route1': 'abc2'}, 
#     102: {'route2': 'abc5', 'route1': 'ab4'}
# }
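To see why the conversion to a sorted tuple matters, here is a small sketch using two hypothetical dicts, `a` and `b`, that have the same content but were built in a different order. It assumes Python 3.7 or later (where dicts keep insertion order) and that the nested keys are mutually comparable, as the strings are here.

```python
a = {'route1': 'abc', 'route2': 'abc1'}
b = {'route2': 'abc1', 'route1': 'abc'}  # same content, different insertion order

# A dict cannot be used as a dictionary key because it is not hashable.
dict_key_failed = False
try:
    {a: [123]}
except TypeError as exc:
    dict_key_failed = True
    print('dict as key fails:', exc)

# Unsorted item tuples follow insertion order, so they differ here
# even though the two dicts compare equal.
unsorted_a = tuple(a.items())
unsorted_b = tuple(b.items())
print(a == b, unsorted_a == unsorted_b)

# Sorting the items first gives one canonical, hashable key for both.
sorted_a = tuple(sorted(a.items()))
sorted_b = tuple(sorted(b.items()))
print(sorted_a == sorted_b, hash(sorted_a) == hash(sorted_b))
```

If the nested dicts are always generated with their keys in the same order, the sort changes nothing, but it is a cheap safeguard when that is not guaranteed.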

