I have a Python dictionary that consists of many nested dictionaries, i.e. it looks like this:
result = {
    123: {
        'route1': 'abc',
        'route2': 'abc1'
    },
    456: {
        'route1': 'abc',
        'route2': 'abc1'
    },
    789: {
        'route1': 'abc2',
        'route2': 'abc3'
    },
    101: {
        'route1': 'abc',
        'route2': 'abc1'
    },
    102: {
        'route1': 'ab4',
        'route2': 'abc5'
    }
}
Here we can see that 123, 456 and 101 have the same values. What I am trying to do is to find the repeated object, which in this case is:
{
    'route1': 'abc',
    'route2': 'abc1'
}
and the keys which have this repeated object, i.e. 123, 456 and 101. How can we do this?
Along with the repeated objects' info, I also want to know which objects do not repeat, i.e. 789 and its respective object, and 102 and its respective object.
PS: Please note that I don't know beforehand which objects are repeated, as this structure will be generated inside code. So it's possible that there are no repeated objects, or that there is more than one. Also, I cannot use pandas or numpy etc. due to some restrictions.
Use collections.defaultdict:
from collections import defaultdict

# group keys by a hashable version of each subdict
d = defaultdict(list)
for k, v in result.items():
    d[tuple(v.items())].append(k)
desired = {
    'route1': 'abc',
    'route2': 'abc1'
}
d[tuple(desired.items())]
Output (on Python 3.7+, where dicts preserve insertion order):
[123, 456, 101]
For non-repeated items, use a list comprehension:
[v for v in d.values() if len(v) == 1]
Output:
[[789], [102]]
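Building on the same grouping, here is a self-contained sketch that splits everything into repeated and unique groups without knowing the desired value in advance (the `repeated`/`unique` names are just illustrative):

```python
from collections import defaultdict

result = {
    123: {'route1': 'abc', 'route2': 'abc1'},
    456: {'route1': 'abc', 'route2': 'abc1'},
    789: {'route1': 'abc2', 'route2': 'abc3'},
    101: {'route1': 'abc', 'route2': 'abc1'},
    102: {'route1': 'ab4', 'route2': 'abc5'},
}

# group keys by a hashable (tuple) version of each subdict
d = defaultdict(list)
for k, v in result.items():
    d[tuple(v.items())].append(k)

# split into repeated and unique groups, converting the
# tuple keys back into the original dicts
repeated = {tuple(keys): dict(t) for t, keys in d.items() if len(keys) > 1}
unique = {keys[0]: dict(t) for t, keys in d.items() if len(keys) == 1}

print(repeated)  # {(123, 456, 101): {'route1': 'abc', 'route2': 'abc1'}}
print(unique)    # {789: {...}, 102: {...}}
```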
You can use the drop_duplicates() function of pandas:
First, transform your dict into a DataFrame:
import pandas as pd
df = pd.DataFrame(result).T
Output:
route1 route2
123 abc abc1
456 abc abc1
789 abc2 abc3
101 abc abc1
102 ab4 abc5
Then use the drop_duplicates function and transform back to a dict:
df2 = df.drop_duplicates(subset=['route1', 'route2']).T.to_dict()
Output:
{
123: {
'route1': 'abc',
'route2': 'abc1'
},
789: {
'route1': 'abc2',
'route2': 'abc3'
},
102: {
'route1': 'ab4',
'route2': 'abc5'
}
}
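Note that drop_duplicates only keeps the first key of each group; if you also need all the keys that share each value, one possible sketch (still pandas, assuming the same `df` as above) uses groupby on the two columns:

```python
import pandas as pd

result = {
    123: {'route1': 'abc', 'route2': 'abc1'},
    456: {'route1': 'abc', 'route2': 'abc1'},
    789: {'route1': 'abc2', 'route2': 'abc3'},
    101: {'route1': 'abc', 'route2': 'abc1'},
    102: {'route1': 'ab4', 'route2': 'abc5'},
}
df = pd.DataFrame(result).T

# .groups maps each (route1, route2) value to the Index of matching keys
groups = df.groupby(['route1', 'route2']).groups
dups = {tuple(idx): dict(zip(['route1', 'route2'], val))
        for val, idx in groups.items() if len(idx) > 1}

print(dups)  # {(123, 456, 101): {'route1': 'abc', 'route2': 'abc1'}}
```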
You can do this by creating a dictionary holding all the matching keys for each distinct value in your result dict (where the values are themselves dicts). This is a fairly common pattern in Python: iterating through one container and aggregating values into a dict. Then, once you've created the aggregation dict, you can split it into duplicate and single values.
To build the aggregation dict, you need to use each subdict from result as a key and append the matching keys from the original dict to a list associated with that subdict. The challenge is that you can't use the subdicts directly as dictionary keys, because they are not hashable. But you can solve that by converting them to tuples. The tuples should also be sorted, to avoid missing duplicates whose items happen to come out in a different order.
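A minimal illustration of why that conversion is needed:

```python
subdict = {'route2': 'abc1', 'route1': 'abc'}

# dicts are unhashable, so they can't be used as dictionary keys directly
try:
    lookup = {subdict: 123}
except TypeError:
    pass  # TypeError: unhashable type: 'dict'

# a sorted tuple of items is hashable, and sorting makes it
# independent of the order the keys were inserted in
key = tuple(sorted(subdict.items()))
print(key)  # (('route1', 'abc'), ('route2', 'abc1'))
```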
It may be easier to understand just by looking at some example code:
result = {
123: {'route1': 'abc', 'route2': 'abc1'},
456: {'route1': 'abc', 'route2': 'abc1'},
789: {'route1': 'abc2', 'route2': 'abc3'},
101: {'route1': 'abc', 'route2': 'abc1'},
102: {'route1': 'ab4', 'route2': 'abc5'}
}
# make a dict showing all the keys that match each subdict
cross_refs = dict()
for key, subdict in result.items():
    # make a hashable version of the subdict (can't use a dict as a lookup key)
    subdict_tuple = tuple(sorted(subdict.items()))
    # create an empty list of keys that match this value (if needed),
    # or retrieve the existing list
    matching_keys = cross_refs.setdefault(subdict_tuple, [])
    # add this item to the list
    matching_keys.append(key)

# make dicts of duplicates and non-duplicates
dups = {}
singles = {}
for subdict_tuple, keys in cross_refs.items():
    # convert the hashable tuple back to a dict
    subdict = dict(subdict_tuple)
    if len(keys) > 1:
        # convert the list of matching keys to a tuple and use it as the key
        dups[tuple(keys)] = subdict
    else:
        # there's only one matching key, so use that as the key
        singles[keys[0]] = subdict
print(dups)
# {
#     (123, 456, 101): {'route1': 'abc', 'route2': 'abc1'}
# }
print(singles)
# {
#     789: {'route1': 'abc2', 'route2': 'abc3'},
#     102: {'route1': 'ab4', 'route2': 'abc5'}
# }