Fastest way to remove duplicates in list of lists in Python?

I have a list of lists in Python 3, where the data looks like this:

['Type1', ['123', '22'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

The list is quite large; the two rows above are an example of the duplicate data I need to remove. Below is an example of data that is NOT duplicated and does not need to be removed:

['Type1', ['789', '45'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

I've already removed all the exact duplicates. What is the fastest way to do this "reversed duplicate" removal in Python 3?

Two possibilities:

  1. Convert each sublist to a tuple and put the tuples into a set. Do the same for the row you are comparing against, then compare the two sets for equality.

  2. Define a sort order for the sublists, then sort the sublists within each row. Rows that contain the same sublists then compare equal directly.

Both approaches work around the sublist-ordering problem; there are many other ways.
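A rough sketch of the two options above, not a benchmarked answer. The helper names `set_key`, `sorted_key`, and `dedupe` are made up for illustration, and the sketch assumes the first element of each row is a label that should be ignored when comparing:

```python
def set_key(row):
    # Option 1: a frozenset of tuples is hashable and ignores sublist order.
    # Assumption: row[0] is a label that does not take part in the comparison.
    return frozenset(tuple(sub) for sub in row[1:])


def sorted_key(row):
    # Option 2: sort the sublists so rows with the same sublists
    # produce an identical tuple.
    return tuple(sorted(tuple(sub) for sub in row[1:]))


def dedupe(rows, key):
    # Keep the first row seen for each key, preserving input order.
    seen = set()
    unique = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            unique.append(row)
    return unique


data = [
    ['Type1', ['123', '22'], ['456', '80']],
    ['Type2', ['456', '80'], ['123', '22']],
    ['Type1', ['789', '45'], ['456', '80']],
]
print(dedupe(data, set_key))
print(dedupe(data, sorted_key))
```

Both keys allow a single pass over the data with a set lookup per row, rather than comparing every row against every other row. They differ only when a row repeats a sublist: the frozenset collapses the repeats, while the sorted tuple keeps them.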

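data = [['Type1', ['123', '22'], ['456', '80']],
        ['Type2', ['456', '80'], ['123', '22']]]

# Reduce each row to an order-independent key: a sorted tuple of its two pairs.
# The type label in row[0] is ignored when comparing.
keys = [tuple(sorted((tuple(row[1]), tuple(row[2])))) for row in data]
print(keys)

# Keep the first row for each key. A set lookup replaces the nested loop,
# and nothing is removed from a list while it is being iterated.
seen = set()
unique = []
for row, key in zip(data, keys):
    if key not in seen:
        seen.add(key)
        unique.append(row)

print(unique)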
