Finding duplicated values in a dictionary in Python

I am trying to write a script that finds duplicated values in a dictionary. My dictionary has integer keys and lists as values:

{5: ['13139', '3219', '3'], 6: ['14072', '3214', '3'], 7: ['13137', '3219', '3'], 8: ['13141', '3219', '3'], 9: ['13139', '3219', '3']}

Here is my code:

for key, value in dict.iteritems():
    for other_key, other_value in dict.iteritems():
        if value == other_value and key != other_key:
            print "We have duplicated values at key {} and key {}".format(key, other_key)

The problem is that when I run the script I get duplicated lines like this:

We have duplicated values at key 5 and key 9
We have duplicated values at key 9 and key 5

So I want to omit the second line. The script also won't tell me when the same values appear at more than two keys. For example, if I have duplicated values at keys 5, 9 and 52, it will show me:

We have duplicated values at key 5 and key 9
We have duplicated values at key 5 and key 52
We have duplicated values at key 9 and key 5
We have duplicated values at key 9 and key 52

What I want it to show is that I have duplicated values at keys 5, 9 and 52.

I also want it to show all the groups of keys with duplicated values. For example, I could have one duplication at keys 5, 9 and 52 and another duplication at keys 40 and 65.

You can convert your dictionary from key -> values into a new dictionary that maps each value to the keys associated with it, and find the duplicates that way.

Example:

d = {'a':[1,2],'b':[3,1],'c':[2,1,5]}

values_keys = {}

for key in d.keys():
    for value in d[key]:
        if value not in values_keys:
            values_keys[value] = [key]
        else:
            values_keys[value].append(key)

for key, value in values_keys.items():
    if len(value) > 1:
        print("key {}: We have duplicated values at keys {}".format(key,','.join(map(str, value))))

Neither of the solutions provided completely solves the problem. To collect the duplicated values, we need to create an "inverse" dictionary whose keys are the values of the original dictionary. As @GeorgeStoyanov pointed out, the keys are integers and the values are lists, so we need to convert these lists to tuples to be able to use them as keys of the inverse dict.
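A minimal check of why the conversion matters, assuming Python 3 syntax: a list cannot be used as a dictionary key because it is unhashable, while a tuple with the same contents can. The names `inverse`, `value` and `error_message` are only illustrative.

```python
inverse = {}
value = ['13139', '3219', '3']

try:
    inverse[value] = [5]        # lists are mutable and therefore unhashable
except TypeError as exc:
    error_message = str(exc)    # the exact wording varies between Python versions
    print(error_message)

inverse[tuple(value)] = [5]     # a tuple of strings is hashable, so this works
print(inverse)
```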

from collections import defaultdict

d = {5: ['13139', '3219', '3'], 6: ['14072', '3214', '3'], 7: ['13137', '3219', '3'], 8: ['13141', '3219', '3'],
     9: ['13139', '3219', '3']}

val_to_keys = defaultdict(list)

for k, v in d.items():
    val_to_keys[tuple(v)].append(k)

for collected_keys in val_to_keys.values():
    if len(collected_keys) > 1:
        print(collected_keys)

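Output: [5, 9] on Python 3.7+, where dicts preserve insertion order. On older versions the order may differ, for example [9, 5].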

If you are using Python 2, you might want to change items() and values() to iteritems() and itervalues().
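To get messages in the form the question asks for ("key 5, 9 and 52"), the same inverse-dictionary idea can be wrapped in a small function. This is only a sketch for Python 3: `find_duplicate_groups` and `describe` are hypothetical helper names, and the values for keys 40, 52 and 65 are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_groups(d):
    """Return one sorted list of keys for each value shared by two or more keys."""
    val_to_keys = defaultdict(list)
    for key, value in d.items():
        # Lists are unhashable, so use a tuple copy as the inverse-dict key.
        val_to_keys[tuple(value)].append(key)
    return [sorted(keys) for keys in val_to_keys.values() if len(keys) > 1]


def describe(keys):
    """Format a group of two or more keys, e.g. [5, 9, 52] -> '5, 9 and 52'."""
    *head, last = map(str, keys)
    return "{} and {}".format(", ".join(head), last)


# Illustrative data: keys 40, 52 and 65 are invented for this example.
data = {
    5: ['13139', '3219', '3'],
    6: ['14072', '3214', '3'],
    9: ['13139', '3219', '3'],
    40: ['1', '2', '3'],
    52: ['13139', '3219', '3'],
    65: ['1', '2', '3'],
}

groups = find_duplicate_groups(data)
for keys in groups:
    print("We have duplicated values at key {}".format(describe(keys)))
```

Each group is reported once, so the mirrored "9 and 5" line from the nested-loop version does not appear, and the whole approach is a single pass over the dictionary rather than a comparison of every pair of keys.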
