
Remove duplicates by key from a list of dictionaries in Python

I am trying to remove the duplicates from the following list

 distinct_cur = [{'rtc': 0, 'vf': 0, 'mtc': 0, 'doc': 'good job', 'foc': 195, 'st': 0.0, 'htc': 2, '_id': ObjectId('58e86a550a0aeff4e14ca6bb'), 'ftc': 0}, 
 {'rtc': 0, 'vf': 0, 'mtc': 0, 'doc': 'good job', 'foc': 454, 'st': 0.8, 'htc': 1, '_id': ObjectId('58e8d03958ae6d179c2b4413'), 'ftc': 1},
 {'rtc': 0, 'vf': 2, 'mtc': 1, 'doc': 'test', 'foc': 45, 'st': 0.8, 'htc': 12, '_id': ObjectId('58e8d03958ae6d180c2b4446'), 'ftc': 0}] 

of dictionaries, based on the condition that if the text under the 'doc' key is the same, one of the dictionaries should be removed. I have tried the following solution:

distinct_cur = [dict(y) for y in set(tuple(x.items()) for x in cur)] 

but duplicates are still present in the final list.

Below is the desired output, since the 'doc' value of the 1st and 2nd dictionaries in distinct_cur is the same ('good job'):

[{'rtc': 0, 'vf': 0, 'mtc': 0, 'doc': 'good job', 'foc': 195, 'st': 0.0, 'htc': 2, '_id': ObjectId('58e86a550a0aeff4e14ca6bb'), 'ftc': 0}, 
 {'rtc': 0, 'vf': 2, 'mtc': 1, 'doc': 'test', 'foc': 45, 'st': 0.8, 'htc': 12, '_id': ObjectId('58e8d03958ae6d180c2b4446'), 'ftc': 0}] 

Thanks in advance!

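You're creating a set out of different elements and expecting it to remove the duplicates based on a criterion that only you know. The first two dictionaries share the same 'doc' value but differ in 'foc', 'st', 'htc', '_id' and 'ftc', so their item tuples are different and the set keeps both.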
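To make the point concrete, here is a small sketch of what the set actually sees. The records are simplified stand-ins for the ones in the question, and '_id' is a plain string rather than an ObjectId (an assumption, so that the snippet runs without pymongo installed).

```python
# Simplified stand-ins for the question's records; '_id' is a plain string
# here (an assumption) instead of a bson ObjectId.
records = [
    {'doc': 'good job', 'foc': 195, '_id': 'a'},
    {'doc': 'good job', 'foc': 454, '_id': 'b'},
    {'doc': 'test', 'foc': 45, '_id': 'c'},
]

# Each dictionary is turned into a tuple of ALL of its (key, value) pairs.
as_tuples = {tuple(r.items()) for r in records}

# The first two tuples differ in 'foc' and '_id', so the set keeps both.
print(len(as_tuples))  # 3
```

The set only collapses dictionaries that are equal in every key and value, which is not the case here.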

You have to iterate through your list and add a dictionary to the result list only if its 'doc' value differs from the ones seen so far, for instance like this:

done = set()
result = []
for d in distinct_cur:
    if d['doc'] not in done:
        done.add(d['doc'])  # note it down for further iterations
        result.append(d)

This keeps only the first occurrence of each group of dictionaries that share the same 'doc' value, by recording the values already seen in an auxiliary set.
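If the same thing is needed for other keys, the loop could be wrapped in a small helper. This is only a sketch: the name dedupe_keep_first and the sample data are made up for illustration, and it assumes every dictionary has the key and that its values are hashable.

```python
def dedupe_keep_first(records, key):
    """Return the records whose value under `key` has not been seen before.

    Hypothetical helper: assumes every record contains `key` and that the
    values stored under it are hashable.
    """
    seen = set()
    result = []
    for record in records:
        value = record[key]
        if value not in seen:
            seen.add(value)  # remember the value for later iterations
            result.append(record)
    return result


# Reduced version of the question's data, for illustration only.
sample = [
    {'doc': 'good job', 'foc': 195},
    {'doc': 'good job', 'foc': 454},
    {'doc': 'test', 'foc': 45},
]

print([r['foc'] for r in dedupe_keep_first(sample, 'doc')])  # [195, 45]
```

The set lookup keeps the whole pass linear in the length of the list, and the original order of the surviving items is preserved.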

Another possibility is to build a dictionary keyed on each item's 'doc' value, iterating backwards through the list so that the first items overwrite the later ones:

# list() turns the dict view into a real list (needed on Python 3)
result = list({i['doc']: i for i in reversed(distinct_cur)}.values())
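One thing to be aware of with the reversed-dictionary approach is the order of the result. The sketch below uses a reduced version of the question's data and assumes Python 3.7+, where dictionaries preserve insertion order.

```python
# Reduced version of the question's data, for illustration only.
records = [
    {'doc': 'good job', 'foc': 195},
    {'doc': 'good job', 'foc': 454},
    {'doc': 'test', 'foc': 45},
]

# Iterating backwards means the FIRST occurrence of each 'doc' is written
# last, so it is the one that survives in the dictionary.
by_doc = {r['doc']: r for r in reversed(records)}
result = list(by_doc.values())

# 'test' was inserted first (it comes first in the reversed list), so it
# now appears before 'good job'.
print([r['foc'] for r in result])  # [45, 195]
```

So the first occurrences are kept, but they do not come back in the same order as with the explicit loop. If the order matters, the loop with the auxiliary set is probably the safer choice.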

I see two similar solutions, and the choice depends on your problem domain: do you want to keep the first instance of a key or the last instance?

Using the last (so that later matches overwrite the previous ones) is simpler:

d = {r['doc']: r for r in distinct_cur}.values()
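For comparison, here is a sketch of both variants side by side on a reduced version of the question's data. The keep-first variant uses dict.setdefault, which only stores a value when the key is not present yet; the variable names are just for illustration, and Python 3.7+ dictionary ordering is assumed.

```python
# Reduced version of the question's data, for illustration only.
records = [
    {'doc': 'good job', 'foc': 195},
    {'doc': 'good job', 'foc': 454},
    {'doc': 'test', 'foc': 45},
]

# Keep the LAST instance: later records overwrite earlier ones.
keep_last = list({r['doc']: r for r in records}.values())

# Keep the FIRST instance: setdefault ignores a key that is already there.
first_by_doc = {}
for r in records:
    first_by_doc.setdefault(r['doc'], r)
keep_first = list(first_by_doc.values())

print([r['foc'] for r in keep_last])   # [454, 45]
print([r['foc'] for r in keep_first])  # [195, 45]
```

The desired output in the question keeps the record with 'foc' 195, which corresponds to the keep-first variant.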

A one-liner to deduplicate the list of dictionaries distinct_cur on the primary key 'doc':

[i for n, i in enumerate(distinct_cur) if i.get('doc') not in [y.get('doc') for y in distinct_cur[n + 1:]]]
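Note that, as written, the one-liner drops an item whenever the same 'doc' appears later in the list, so it keeps the last occurrence. The sketch below shows that on a reduced version of the question's data, together with a variant that looks at the preceding slice instead and therefore keeps the first occurrence. Both rescan a slice for every item, so they are quadratic and probably best suited to short lists.

```python
# Reduced version of the question's data, for illustration only.
records = [
    {'doc': 'good job', 'foc': 195},
    {'doc': 'good job', 'foc': 454},
    {'doc': 'test', 'foc': 45},
]

# Original form: skip an item if its 'doc' shows up LATER in the list.
keep_last = [
    d for n, d in enumerate(records)
    if d.get('doc') not in [y.get('doc') for y in records[n + 1:]]
]

# Variant: skip an item if its 'doc' already showed up EARLIER in the list.
keep_first = [
    d for n, d in enumerate(records)
    if d.get('doc') not in [y.get('doc') for y in records[:n]]
]

print([d['foc'] for d in keep_last])   # [454, 45]
print([d['foc'] for d in keep_first])  # [195, 45]
```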

Try this:

distinct_cur = [dict(t) for t in set(tuple(d.items()) for d in distinct_cur)]

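Worked for me... Note that this only removes dictionaries that are identical in every key and value, so it does not deduplicate on 'doc' alone.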
