Removing duplicates in values of a dictionary in Python

Sorry the topic's title is vague; I find it hard to explain.

I have a dictionary in which each value is a list of items. I wish to remove the duplicated items, so that each item appears as few times as possible (preferably once) across the lists.

Consider the dictionary:

example_dictionary = {"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3]}

'weapon2' and 'weapon3' have the same values, so it should result in:

result_dictionary = {"weapon1":[1],"weapon2":[3],"weapon3":[2]}

Since I don't mind the order, it can also result in:

result_dictionary = {"weapon1":[1],"weapon2":[2],"weapon3":[3]}

But when "there's no choice", it should leave the value. Consider this new dictionary:

example_dictionary = {"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3],"weapon4":[3]}

Now, since neither '2' nor '3' can be assigned only once without leaving a key empty, a possible output would be:

result_dictionary = {"weapon1":[1],"weapon2":[3],"weapon3":[2],"weapon4":[3]}

I can relax the problem to only the first part and manage, though I would prefer a solution that handles both parts together.

#!/usr/bin/env python3

example_dictionary = {"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3]}

result = {}
used_values = []

def extract_semi_unique_value(my_list):
    for val in my_list:
        if val not in used_values:
            used_values.append(val)
            return val
    return my_list[0]

for key, value in example_dictionary.items():
    semi_unique_value = extract_semi_unique_value(value)
    result[key] = [semi_unique_value]

print(result)
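
One caveat about a greedy pass like the one above: its result depends on the order in which keys are visited. The sketch below restates the same logic as a function (the name `greedy_pick` is hypothetical, not from the original answer) and shows an input where an early key takes the only value a later key could have used.

```python
def greedy_pick(weapons):
    """Greedy pass: give each key its first value that is not yet used.

    Hypothetical helper restating the loop above as a function.
    Assumes every list is non-empty.
    """
    used = set()
    result = {}
    for key, values in weapons.items():
        # First unused value, falling back to the first value of the list.
        choice = next((v for v in values if v not in used), values[0])
        used.add(choice)
        result[key] = [choice]
    return result


# "a" grabs 1 first, so "b" is forced to reuse it,
# even though a=2, b=1 would avoid the duplicate.
print(greedy_pick({"a": [1, 2], "b": [1]}))

# Visiting the more constrained key first avoids the duplicate here.
print(greedy_pick({"b": [1], "a": [1, 2]}))
```

So the greedy version is fast, but it may keep a duplicate that a different assignment would avoid.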

This is probably not the most efficient solution possible. Because it iterates over all possible combinations, it will run quite slowly for large dictionaries.

It uses itertools.product() to generate all possible combinations, then picks the combination with the most unique numbers (by testing the length of a set).

from itertools import product
def dedup(weapons):
    # get the keys and values ordered so we can join them back
    #  up again at the end
    keys, vals = zip(*weapons.items())

    # because sets remove all duplicates, whichever combo has
    #  the longest set is the most unique
    best = max(product(*vals), key=lambda combo: len(set(combo)))

    # combine the keys and whatever we found was the best combo
    return {k: [v] for k, v in zip(keys, best)}

From the examples:

dedup({"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3]})
#: {'weapon1': [1], 'weapon2': [2], 'weapon3': [3]}
dedup({"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3],"weapon4":[3]})
#: {'weapon1': [1], 'weapon2': [2], 'weapon3': [2], 'weapon4': [3]}
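
If the brute-force search becomes too slow, one possible alternative is to treat the problem as bipartite matching between keys and values. The following is only a sketch of that idea using augmenting paths; it is not taken from the original answers, the name `dedup_matching` is hypothetical, and it assumes every list is non-empty and the values are hashable. It maximizes the number of distinct values used, and any key that cannot get a value of its own simply repeats the first value in its list.

```python
def dedup_matching(weapons):
    """Assign one value per key, reusing values only when unavoidable.

    Sketch based on augmenting paths (bipartite matching).
    Assumes non-empty lists and hashable values.
    """
    owner = {}  # value -> key currently holding it

    def try_assign(key, seen):
        # Try to give `key` a value; if the value is taken, try to move
        # its current owner to another value first.
        for value in weapons[key]:
            if value in seen:
                continue
            seen.add(value)
            if value not in owner or try_assign(owner[value], seen):
                owner[value] = key
                return True
        return False

    for key in weapons:
        try_assign(key, set())

    assigned = {key: value for value, key in owner.items()}
    # Unmatched keys have no free value left, so they repeat their first one.
    return {key: [assigned.get(key, values[0])]
            for key, values in weapons.items()}


print(dedup_matching({"weapon1": [1, 2, 3], "weapon2": [2, 3], "weapon3": [2, 3]}))
print(dedup_matching({"weapon1": [1, 2, 3], "weapon2": [2, 3],
                      "weapon3": [2, 3], "weapon4": [3]}))
```

The helper is recursive, so for very large dictionaries an iterative rewrite or a library matching routine may be a better fit.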

This could help:

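import itertools
res = {'weapon1': [1, 2, 3], 'weapon2': [2, 3], 'weapon3': [2, 3]}
# every distinct value, each wrapped in its own list, e.g. [[1], [2], [3]]
r = [[x] for x in set(itertools.chain.from_iterable(res.values()))]
r2 = list(res.keys())
# all (key, [value]) pairs, grouped by key
r3 = list(itertools.product(r2, r))
# take every (len(r) + 1)-th pair, i.e. the "diagonal": key i gets value i;
# note this does not check that the value came from that key's own list
r4 = dict(r3[x] for x in range(len(r3)) if not x % (len(r) + 1))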
