
Removing duplicates from list of lists in Python

Can anyone suggest a good way to remove duplicates from a list of nested lists, where two nested lists count as duplicates if their first elements are equal?

The main list looks like this:

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]

If a list has the same element at the first position [k][0] as a list that occurred earlier, I'd like to remove that list and get this result:

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33]]

Can you suggest an algorithm to achieve this goal?

Do you care about preserving order, or about which duplicate is removed? If not, then this will do it (when a first element repeats, the last list carrying it is the one that is kept):

list(dict((x[0], x) for x in L).values())

If you want to preserve order and keep the first one you find, then:

def unique_items(L):
    found = set()
    for item in L:
        if item[0] not in found:
            yield item
            found.add(item[0])

print(list(unique_items(L)))
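
The generator above hard-codes `item[0]` as the comparison key. A minimal sketch of the same idea with the key passed in as a function; `unique_by` and its `key` parameter are illustrative names, not part of the original answer:

```python
def unique_by(items, key):
    """Yield each item whose key has not been seen yet, in original order."""
    seen = set()
    for item in items:
        k = key(item)  # keys must be hashable to be stored in a set
        if k not in seen:
            seen.add(k)
            yield item


L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
result = list(unique_by(L, key=lambda row: row[0]))
print(result)
```

Passing `key=lambda row: row[0]` reproduces the behaviour asked for in the question; a different key, such as `lambda row: (row[0], row[1])`, would compare on the first two elements instead.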

Use a dict instead, keyed on the first element, like so:

L = {'14': ['65', 76], '2': ['5', 6], '7': ['12', 33]}
L['14'] = ['22', 46]

If you are receiving the first list from some external source, convert it like so:

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]
L_dict = dict((x[0], x[1:]) for x in L)
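
One thing to keep in mind with the conversion above: when a key repeats, the later row overwrites the earlier one, so `'14'` ends up mapped to `['22', 46]`. If the first occurrence should win instead, `dict.setdefault` is one way to do it. A sketch, with `last_wins` and `first_wins` as illustrative names:

```python
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46]]

# Plain conversion: a repeated key is overwritten by the later row.
last_wins = {row[0]: row[1:] for row in L}

# setdefault only stores a value if the key is not present yet,
# so the first row seen for each key is the one that is kept.
first_wins = {}
for row in L:
    first_wins.setdefault(row[0], row[1:])

print(last_wins['14'])   # ['22', 46]
print(first_wins['14'])  # ['65', 76]
```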

I am not sure what you meant by "another list", so I assume you mean the lists inside L:

a = []  # first elements seen so far
L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]
for item in L:
    if item[0] not in a:
        a.append(item[0])
        print(item)
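
The loop above only prints the rows it keeps, and checking membership in a list gets slow for long inputs. If the goal is to actually remove the duplicates from `L`, one possible sketch uses a set for the lookups and slice assignment to update the list in place (`dedupe_in_place` is a made-up helper name):

```python
def dedupe_in_place(rows):
    """Remove rows whose first element was already seen; mutates `rows`."""
    seen = set()
    kept = []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            kept.append(row)
    rows[:] = kept  # slice assignment replaces the contents, same list object


L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46], ['7', 'a', 'b']]
dedupe_in_place(L)
print(L)
```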

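If the order does not matter, the code below works. Iterating over reversed(L) means the first occurrence of each key is written to the dict last, so it is the one that survives:

print([[k] + v for (k, v) in dict([[a[0], a[1:]] for a in reversed(L)]).items()])

With the L from the question, on Python 3.7+ (where dicts keep insertion order; older versions may order the rows differently), this gives

[['14', '65', 76], ['7', '12', 33], ['2', '5', 6]]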

Use pandas:

import pandas as pd

L = [['14', '65', 76], ['2', '5', 6], ['7', '12', 33], ['14', '22', 46],['7','a','b']]

df = pd.DataFrame(L)
# With no arguments, drop_duplicates() only removes rows that are identical in every column
df = df.drop_duplicates()

L_no_duplicates = df.values.tolist()

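If you want to drop duplicates based on specific columns only, pass those columns as the subset instead, for example columns 1 and 2:

df = df.drop_duplicates(subset=[1, 2])

For the question above, where only the first element of each list matters, that would be subset=[0], which keeps the first row for each value by default.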

Note: technical posts on this site are licensed under CC BY-SA 4.0. If you reproduce this content, cite this site's URL or the address of the original post.