
Python: Remove duplicates from a list of dictionaries based on a value

I have a list of dictionaries:

vals = [
         {'tmpl_id': 67,  'qty_available': -3.0, 'product_id': 72, 'product_qty': 1.0},
         {'tmpl_id': 67,  'qty_available': 5.0, 'product_id': 71, 'product_qty': 1.0},
         {'tmpl_id': 69,  'qty_available': 10.0, 'product_id': 74, 'product_qty': 1.0}
       ]

import itertools
from operator import itemgetter

getvals = itemgetter('tmpl_id')

vals.sort(key=getvals)

result = []

for k, g in itertools.groupby(vals, getvals):
    # keeps the first dict of each group, whatever its qty_available
    result.append(next(g))

vals[:] = result

I want to remove entries that share the same tmpl_id, dropping the one whose qty_available is lower or negative.

The output should look like:

vals = [
          {'tmpl_id': 67,  'qty_available': 5.0, 'product_id': 71, 'product_qty': 1.0},
          {'tmpl_id': 69,  'qty_available': 10.0, 'product_id': 74, 'product_qty': 1.0}
       ]

from collections import Counter

vals = [{'tmpl_id': 67,  'qty_available': -3.0, 'product_id': 72, 'product_qty': 1.0},
        {'tmpl_id': 67,  'qty_available': 5.0, 'product_id': 71, 'product_qty': 1.0},
        {'tmpl_id': 69,  'qty_available': 10.0, 'product_id': 74, 'product_qty': 1.0},]

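k = [x['tmpl_id'] for x in vals]

new_vals = []

# Counter(k) is used here only to iterate over the distinct tmpl_id values
for i in Counter(k):
    # avoid naming this list `all`, which would shadow the built-in
    group = [x for x in vals if x['tmpl_id'] == i]
    new_vals.append(max(group, key=lambda x: x['qty_available']))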

>>> new_vals
[
    {'product_qty': 1.0, 'qty_available': 5.0, 'tmpl_id': 67, 'product_id': 71}, 
    {'product_qty': 1.0, 'qty_available': 10.0, 'tmpl_id': 69, 'product_id': 74}
]
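
The loop above rescans vals once for every distinct tmpl_id. As a minimal sketch of the same idea done in a single grouping pass, assuming the sample data from the question, the dicts can be collected per tmpl_id with collections.defaultdict and then reduced with max. The names `groups` and `row` are illustrative and not part of the original answer.

```python
from collections import defaultdict

vals = [
    {'tmpl_id': 67, 'qty_available': -3.0, 'product_id': 72, 'product_qty': 1.0},
    {'tmpl_id': 67, 'qty_available': 5.0, 'product_id': 71, 'product_qty': 1.0},
    {'tmpl_id': 69, 'qty_available': 10.0, 'product_id': 74, 'product_qty': 1.0},
]

# Group the dicts by tmpl_id in one pass over the list.
groups = defaultdict(list)
for row in vals:
    groups[row['tmpl_id']].append(row)

# Keep, for each tmpl_id, the dict with the largest qty_available.
new_vals = [max(rows, key=lambda r: r['qty_available']) for rows in groups.values()]

print(new_vals)
```

The result holds the same two dicts as the output above (product_id 71 and 74).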

You can store the dicts in a dictionary, using the value of "tmpl_id" as the key and the dict itself as the value. If you later meet a dict with the same key and a higher 'qty_available', replace the stored dict with the current one:

def remove_dupes(l, k, k2):
    seen = {}
    # iterate over the list passed in, not the global vals
    for d in l:
        v, v2 = d[k], d[k2]
        if v not in seen:
            seen[v] = d
        elif v2 > seen[v][k2]:
            seen[v] = d
    return seen

vals[:] = remove_dupes(vals, "tmpl_id", 'qty_available').values()

Output:

[{'product_id': 71, 'qty_available': 5.0, 'tmpl_id': 67, 'product_qty': 1.0}, 
{'product_id': 74, 'qty_available': 10.0, 'tmpl_id': 69, 'product_qty': 1.0}]
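
Two properties of this approach may be worth checking: with a strict `>` comparison, a tie on 'qty_available' keeps the dict that was seen first, and on Python 3.7+ (where dicts preserve insertion order) the result follows the first appearance of each key. The sketch below illustrates both. The function name `keep_max_per_key`, its parameter names, and the extra row with product_id 75 are assumptions made for the example, not part of the original answer or data.

```python
def keep_max_per_key(rows, group_key, rank_key):
    """Return one dict per group_key value: the one with the highest rank_key."""
    best = {}
    for row in rows:
        current = best.get(row[group_key])
        # Strict '>' means the first-seen dict wins when values are equal.
        if current is None or row[rank_key] > current[rank_key]:
            best[row[group_key]] = row
    return list(best.values())


rows = [
    {"tmpl_id": 67, "qty_available": -3.0, "product_id": 72},
    {"tmpl_id": 69, "qty_available": 10.0, "product_id": 74},
    {"tmpl_id": 67, "qty_available": 5.0, "product_id": 71},
    {"tmpl_id": 69, "qty_available": 10.0, "product_id": 75},  # hypothetical tie with 74
]

deduped = keep_max_per_key(rows, "tmpl_id", "qty_available")
print([r["product_id"] for r in deduped])  # [71, 74]
```

Replacing a value under an existing key does not move that key, so tmpl_id 67 stays ahead of 69 even though its winning dict appears later in the input.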

If you were to use sorted and groupby, you just need to sort in reverse and take the first value from each group v:

from itertools import groupby
from operator import itemgetter

keys = itemgetter("tmpl_id",'qty_available')

vals[:] = (next(v) for k,v in groupby(sorted(vals, key=keys,reverse=True), 
                 key=itemgetter("tmpl_id")))

print(vals)

Reversing the sort means the dict with the higher 'qty_available' comes first within each tmpl_id. For a tmpl_id that appears only once you simply get that dict, and for a repeated tmpl_id you get the one with the largest 'qty_available'.

If you want an in-place sort instead of creating a new sorted list, call vals.sort(key=keys, reverse=True) first and remove the call to sorted.
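
A minimal sketch of that in-place variant, assuming the sample data from the question. The right-hand side uses a list comprehension so the groups are fully consumed before vals is overwritten; the variable name `group` is illustrative.

```python
from itertools import groupby
from operator import itemgetter

vals = [
    {'tmpl_id': 67, 'qty_available': -3.0, 'product_id': 72, 'product_qty': 1.0},
    {'tmpl_id': 67, 'qty_available': 5.0, 'product_id': 71, 'product_qty': 1.0},
    {'tmpl_id': 69, 'qty_available': 10.0, 'product_id': 74, 'product_qty': 1.0},
]

# Sort in place, descending by (tmpl_id, qty_available), so the dict with the
# largest qty_available comes first within each tmpl_id.
vals.sort(key=itemgetter('tmpl_id', 'qty_available'), reverse=True)

# Take the first dict of each tmpl_id group.
vals[:] = [next(group) for _, group in groupby(vals, key=itemgetter('tmpl_id'))]

print(vals)
```

Because the sort is descending, the result is ordered by tmpl_id from highest to lowest (69 before 67 here), which differs from the order in the question's expected output.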


 