
Drop the first duplicates from a list in Python and keep only the last occurrence of each value

I'm trying to do this with the following function:

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

So I want to check:

f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]) 

When I run this, the output is:

[5, 9, 6, 8, 7]

This keeps only the first occurrence of each value, but I need to keep only the last occurrence.

So the output should be:

[5, 7, 8, 6, 9]

This could work:

In [1847]: def f7(seq):
      ...:     seen = set()
      ...:     seen_add = seen.add
      ...:     return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]
      ...:

In [1848]: f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9])
Out[1848]: [5, 7, 8, 6, 9]

You can collect the unique items by creating a set, then sort them by their index in the reversed sequence, which locates the last occurrence of each element. Sorting in descending order of that index restores the original order.

def f7(seq):
    return sorted(set(seq), key=lambda i: seq[::-1].index(i), reverse=True)

>>> f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9])
[5, 7, 8, 6, 9]
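
Note that `seq[::-1].index(i)` rebuilds and scans the reversed list once per unique value, so the approach above can become quadratic on long inputs. A minimal sketch of the same idea with a single pass to record last positions, assuming the items are hashable (the name `keep_last_by_index` is hypothetical and not part of the original answer):

```python
def keep_last_by_index(seq):
    # Later occurrences overwrite earlier ones, so each value ends up
    # mapped to the index of its last occurrence.
    last_index = {x: i for i, x in enumerate(seq)}
    # Sorting the unique values by that index restores their relative order.
    return sorted(last_index, key=last_index.get)


print(keep_last_by_index([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```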

Do it in reverse. Here's a rough implementation:

l = [5, 5, 9, 6, 8, 7, 7, 8, 6, 9]
def f7(seq):
    seen = set()
    seen_add = seen.add
    return list(reversed([x for x in reversed(seq) if not (x in seen or seen_add(x))]))

print(f7(l))

Output:

[5, 7, 8, 6, 9]

If you want to make this more efficient, you can use a descending for loop and/or a collections.deque.
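
The descending-loop suggestion could look roughly like the sketch below. It assumes `seq` supports `len()` and indexing and that its items are hashable; the name `f7_descending` is only illustrative.

```python
def f7_descending(seq):
    seen = set()
    result = []
    # Walk the indices from the last element down to the first.
    for i in range(len(seq) - 1, -1, -1):
        x = seq[i]
        if x not in seen:
            seen.add(x)
            result.append(x)
    # Items were collected last-to-first, so reverse in place to restore order.
    result.reverse()
    return result


print(f7_descending([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```

This avoids building a reversed copy of the input; only the (usually shorter) result list is reversed.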

You can reverse the list first and then reverse the answer.

>>> def f7(seq):
...     seen = set()
...     seen_add = seen.add
...     return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]
...
>>> print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]) )
[5, 7, 8, 6, 9]

You can use dict.fromkeys:

def f7(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]

print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]

This works on Python 3.6 and later because it relies on dictionaries preserving insertion order (an implementation detail of CPython 3.6, guaranteed by the language from 3.7).
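
If the code has to run on an interpreter where plain dictionaries do not keep insertion order, one possible fallback is `collections.OrderedDict`, which offers the same `fromkeys` class method. This is a sketch under that assumption, and `f7_ordered` is a hypothetical name:

```python
from collections import OrderedDict


def f7_ordered(seq):
    # OrderedDict remembers the order in which keys were first inserted,
    # so feeding it the reversed sequence keeps the last occurrences.
    return list(OrderedDict.fromkeys(reversed(seq)))[::-1]


print(f7_ordered([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```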


Here is a simple benchmark of the proposed solutions:


from simple_benchmark import BenchmarkBuilder
import random
from collections import deque


b = BenchmarkBuilder()

@b.add_function()
def MayankPorwal(seq):
    seen = set() 
    seen_add = seen.add 
    return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1] 

@b.add_function()
def CoryKramer(seq):
    return sorted(set(seq), key=lambda i: seq[::-1].index(i), reverse=True)

@b.add_function()
def kederrac(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]

@b.add_function()
def LeKhan9(seq):
    q = deque()
    seen = set()
    seen_add = seen.add
    for x in reversed(seq):
        if not (x in seen or seen_add(x)):
            q.appendleft(x)

    return list(q)

@b.add_arguments('List length')
def argument_provider():
    for exp in range(2, 14):
        size = 2**exp
        yield size, [random.randint(0, size) for _ in range(size)]

r = b.run()
r.plot()
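
`simple_benchmark` is a third-party package. If it is not installed, a rough comparison can be made with the standard-library `timeit` module instead. The sketch below times only two of the approaches; the function names, input size, and repeat count are arbitrary choices, and the timings will vary by machine.

```python
import random
import timeit


def with_set(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]


def with_dict(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]


size = 2 ** 10  # arbitrary input size
data = [random.randint(0, size) for _ in range(size)]

# Both approaches should agree before their timings are compared.
assert with_set(data) == with_dict(data)

for func in (with_set, with_dict):
    elapsed = timeit.timeit(lambda: func(data), number=200)
    print(f"{func.__name__}: {elapsed:.4f} s for 200 calls")
```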

To avoid reversing twice, you can use a double-ended queue (collections.deque). Each appendleft call here should run in constant time.

from collections import deque

def f7(seq):
    q = deque()
    seen = set()
    seen_add = seen.add
    for x in reversed(seq):
        if not (x in seen or seen_add(x)):
            q.appendleft(x)

    return list(q)


print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))

Output: [5, 7, 8, 6, 9]
