
Python comparing a list of lists

I have a list of lists in this format:

[[<image object1>, source1, version1], [<image object2>, source2, version2], ...]

I need to compare the sublists and build a new list of lists in which each source value appears only once. When a source value is duplicated, I need to keep the sublist with the highest version value.

Also, is this the proper data structure to use?

You can use itertools.groupby together with the max function for that. Sort the list by source first, group by source, then take the sublist with the highest version from each group:

>>> lst = [['foo', 1, 2], ['asdf', 2, 5], ['bar', 1, 3]]
>>> import itertools as it
>>> from operator import itemgetter
>>> [max(items, key=itemgetter(2)) 
     for _,items in it.groupby(sorted(lst, key=itemgetter(1)), key=itemgetter(1))]
[['bar', 1, 3], ['asdf', 2, 5]]
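
The sort in the one-liner above matters: itertools.groupby only groups consecutive items that share a key, so unsorted input could produce several groups for the same source. As a rough sketch, the same idea can be wrapped in a function. The name latest_per_source is hypothetical, and the code assumes every sublist has the form [image, source, version]:

```python
import itertools as it
from operator import itemgetter


def latest_per_source(rows):
    """Return one row per source, keeping the row with the highest version.

    Assumes each row looks like [image, source, version].
    """
    by_source = itemgetter(1)
    by_version = itemgetter(2)
    # groupby only merges adjacent items with equal keys, so sort by source first.
    ordered = sorted(rows, key=by_source)
    return [max(group, key=by_version)
            for _, group in it.groupby(ordered, key=by_source)]


lst = [['foo', 1, 2], ['asdf', 2, 5], ['bar', 1, 3]]
print(latest_per_source(lst))  # [['bar', 1, 3], ['asdf', 2, 5]]
```

Because of the sort, the result comes back ordered by source rather than in the original input order.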

Assuming all of your sublists share that same three-item structure, this is a fairly sensible data structure, since you can always access the image object, source and version with the indexes [0], [1] and [2].
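
If the bare indexes become hard to read, one possible alternative (not something the question requires) is a named tuple, which keeps index access working while adding field names. The type name ImageRecord and its field names are made up for illustration:

```python
from collections import namedtuple

# Hypothetical record type; the field names mirror the question's description.
ImageRecord = namedtuple("ImageRecord", ["image", "source", "version"])

rows = [['foo', 1, 2], ['asdf', 2, 5], ['bar', 1, 3]]
records = [ImageRecord(*row) for row in rows]

first = records[0]
print(first.source, first.version)  # 1 2
print(first[1] == first.source)     # True: index access still works
```

Named tuples are immutable, so this only fits if the records are not modified in place after they are created.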

This code uses the sources as dictionary keys and stores, for each source, the sublist with the highest version seen so far:

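bigList = [['foo', 1, 2], ['asdf', 2, 5], ['bar', 1, 3]]
uniqueSources = {}  # maps source -> sublist with the highest version so far
for sublist in bigList:
    currentSource = sublist[1]
    if currentSource in uniqueSources:
        # Replace the stored sublist only if this one has a higher version.
        if sublist[2] > uniqueSources[currentSource][2]:
            uniqueSources[currentSource] = sublist
    else:
        # First time this source is seen.
        uniqueSources[currentSource] = sublist
dupesRemoved = list(uniqueSources.values())
print(dupesRemoved)  # [['bar', 1, 3], ['asdf', 2, 5]] on Python 3.7+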


 