
Fastest way to remove duplicates in list of lists in Python?

I have a list of lists in Python3, where the data looks like this:

['Type1', ['123', '22'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

The list is quite large, but the above is an example of duplicate data I need to remove. Below is an example of data that is NOT duplicated and does not need to be removed:

['Type1', ['789', '45'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

I've already removed all the exact identical duplicates. What is the fastest way to accomplish this "reversed duplicate" removal in Python3?

Two possibilities:

  1. Convert each sublist to a tuple and insert into a set. Do the same for the compare candidate and compare sets to determine equality.

  2. Establish a sorting method for the sublists, then sort each list of sublists. This will enable easy comparison.

Both approaches work around the real problem, which is that sublist order should not matter when comparing rows; there are plenty of other ways to do the same thing.

Note that the naive pairwise loop (`for x in myList: for y in myList: if x == y: myList.remove(x)`) is broken: every element compares equal to itself, and removing from a list while iterating over it skips items. A correct version of approach 1 builds an order-insensitive key for each row and keeps only the first row seen with that key:

data = [['Type1', ['123', '22'], ['456', '80']],
        ['Type2', ['456', '80'], ['123', '22']]]

seen = set()
result = []
for row in data:
    # Convert each sublist to a tuple (hashable) and collect them in a
    # frozenset, so ['123','22'],['456','80'] and the reversed order
    # produce the same key.
    key = frozenset(tuple(sub) for sub in row[1:])
    if key not in seen:
        seen.add(key)
        result.append(row)

print(result)
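Approach 2 (sorting the sublists before comparing) can be sketched like this; `dedupe_reversed` and the sample `data` are illustrative names, not from the question. Sorting keeps duplicate sublists within a row distinct, which a set-based key would collapse:

```python
data = [
    ['Type1', ['123', '22'], ['456', '80']],
    ['Type2', ['456', '80'], ['123', '22']],   # reversed duplicate of row 1
    ['Type1', ['789', '45'], ['456', '80']],   # genuinely different
]

def dedupe_reversed(rows):
    """Keep the first row for each unordered combination of sublists."""
    seen = set()
    out = []
    for row in rows:
        # Sort the sublists (as tuples) so their order no longer matters.
        key = tuple(sorted(tuple(sub) for sub in row[1:]))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

print(dedupe_reversed(data))
```

Both versions run in roughly O(n) time over the number of rows (plus the small cost of sorting each row's sublists), which is considerably faster than pairwise comparison of all rows.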
