Drop the first duplicates from a Python list and keep only the last occurrence

I am trying to use this function:

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

So I want to check:

f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]) 

When I run this, the output is:

[5, 9, 6, 8, 7]

This keeps only the first occurrence of each value, but I need to keep only the last occurrence.

So the output should be:

[5, 7, 8, 6, 9]

This works:

In [1847]: def f7(seq):
      ...:     seen = set()
      ...:     seen_add = seen.add
      ...:     return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]
      ...:

In [1848]: f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9])
Out[1848]: [5, 7, 8, 6, 9]

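You can create a set to collect the unique items, then sort them by their index in the reversed sequence, which finds the last occurrence of each element, and reverse the sort to get back the original order: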

def f7(seq):
    return sorted(set(seq), key=lambda i: seq[::-1].index(i), reverse=True)

>>> f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9])
[5, 7, 8, 6, 9]
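
Note that the version above evaluates seq[::-1].index(i) once per unique value, so it copies and scans the list repeatedly. Below is a minimal sketch of the same idea in a single pass, assuming the items are hashable. The name keep_last_by_index is made up for illustration and does not come from the answer.

```python
def keep_last_by_index(seq):
    # Map each value to the index of its last occurrence:
    # later assignments overwrite earlier ones.
    last_index = {x: i for i, x in enumerate(seq)}
    # Order the unique values by where they last appeared.
    return sorted(last_index, key=last_index.get)


print(keep_last_by_index([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```

Because the sort key is an explicit index, this does not rely on dictionary ordering.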

Do it in reverse. Here is a rough implementation:

l = [5, 5, 9, 6, 8, 7, 7, 8, 6, 9]
def f7(seq):
    seen = set()
    seen_add = seen.add
    return list(reversed([x for x in reversed(seq) if not (x in seen or seen_add(x))]))

print(f7(l))

Output:

[5, 7, 8, 6, 9]

If you want it to be more efficient, you could use a descending for loop and/or collections.deque.
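
One way the descending loop might look is sketched below. It assumes seq supports len() and indexing and that its items are hashable. The name keep_last_loop is hypothetical.

```python
def keep_last_loop(seq):
    seen = set()
    result = []
    # Walk the indices from the end to the start, so the first time
    # a value is met is its last occurrence in seq.
    for i in range(len(seq) - 1, -1, -1):
        x = seq[i]
        if x not in seen:
            seen.add(x)
            result.append(x)
    # The values were collected back to front; restore the order in place.
    result.reverse()
    return result


print(keep_last_loop([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```

This avoids copying the input with seq[::-1]. Only the shorter result list is reversed, and that happens in place.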

You can reverse the list first, then reverse the result.

>>> def f7(seq):
...     seen = set()
...     seen_add = seen.add
...     return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]
...
>>> print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
[5, 7, 8, 6, 9]

You can use dict.fromkeys:

def f7(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]

print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]

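This works if your Python version is >= 3.6, because it relies on dictionaries preserving insertion order. That ordering is an implementation detail of CPython 3.6 and a language guarantee only from Python 3.7.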
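
For interpreters where dict ordering cannot be relied on, a possible variant uses collections.OrderedDict. This is only a sketch, and the name keep_last_ordered is made up for illustration.

```python
from collections import OrderedDict


def keep_last_ordered(seq):
    # OrderedDict remembers insertion order regardless of the dict
    # implementation, so the first key seen in the reversed sequence
    # is the last occurrence in the original one.
    unique_reversed = OrderedDict.fromkeys(reversed(seq))
    # Flip the unique values back into their original order.
    return list(unique_reversed)[::-1]


print(keep_last_ordered([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))
# [5, 7, 8, 6, 9]
```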


Here is a simple benchmark of the proposed solutions:

from simple_benchmark import BenchmarkBuilder
import random
from collections import deque


b = BenchmarkBuilder()

@b.add_function()
def MayankPorwal(seq):
    seen = set() 
    seen_add = seen.add 
    return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1] 

@b.add_function()
def CoryKramer(seq):
    return sorted(set(seq), key=lambda i: seq[::-1].index(i), reverse=True)

@b.add_function()
def kederrac(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]

@b.add_function()
def LeKhan9(seq):
    q = deque()
    seen = set()
    seen_add = seen.add
    for x in reversed(seq):
        if not (x in seen or seen_add(x)):
            q.appendleft(x)

    return list(q)

@b.add_arguments('List length')
def argument_provider():
    for exp in range(2, 14):
        size = 2**exp
        yield size, [random.randint(0, size) for _ in range(size)]

r = b.run()
r.plot()
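
If simple_benchmark is not installed, a rough comparison can be sketched with the standard library's timeit module instead. The function names reverse_twice, from_keys and time_all below are hypothetical. Only two of the solutions are included, and from_keys assumes Python 3.7+ dict ordering.

```python
import random
import timeit


def reverse_twice(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq[::-1] if not (x in seen or seen_add(x))][::-1]


def from_keys(seq):
    return list(dict.fromkeys(seq[::-1]))[::-1]


def time_all(size, repeats=100):
    # Same input shape as the benchmark above: `size` random ints in [0, size].
    data = [random.randint(0, size) for _ in range(size)]
    # Both functions must agree before their timings are worth comparing.
    assert reverse_twice(data) == from_keys(data)
    return {
        func.__name__: timeit.timeit(lambda: func(data), number=repeats)
        for func in (reverse_twice, from_keys)
    }


for name, seconds in time_all(1024).items():
    print(f"{name}: {seconds:.4f} s for 100 runs")
```

The timings depend on the machine and on the random input, so treat them as a rough guide only.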

To avoid reversing twice, you can use a queue data structure. The appends here should run in constant time.

from collections import deque

def f7(seq):
    q = deque()
    seen = set()
    seen_add = seen.add
    for x in reversed(seq):
        if not (x in seen or seen_add(x)):
            q.appendleft(x)

    return list(q)


print(f7([5, 5, 9, 6, 8, 7, 7, 8, 6, 9]))

Output: [5, 7, 8, 6, 9]

