
Fastest way to remove duplicates in a list using each entry's instance variables

TL;DR

Is there a faster way to do the following?

listOfClasses = [fooA, fooB, fooC]
setOfStrings = {c.string for c in listOfClasses}
newListOfClasses = []
for c in listOfClasses:
  if c.string in setOfStrings:
    newListOfClasses.append(c)
    setOfStrings.remove(c.string)

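Constraints/caveats:

  • listOfClasses

    • The entries are not guaranteed to be unique; the list may contain duplicate classes
    • The number of entries is usually in the range of 3-4, but can be as many as ~20
  • newListOfClasses

    • New classes cannot be created; the result must use the same objects (these objects are not mine and I don't know how they are initialized, so I don't know what they contain besides the string).
    • The result must at least be iterable
    • The order of the result doesn't matter
    • The original listOfClasses may be overwritten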

Suppose I have a class

class Foo(object):
  def __init__(self,string):
    self.string = string

and a list of classes from which I want to remove every class that has a duplicate "string" instance variable

fooA = Foo("alice")
fooB = Foo("alice")
fooC = Foo("His Royal Highness The Prince Philip, Duke of Edinburgh, Earl of Merioneth, Baron Greenwich, Royal Knight of the Most Noble Order of the Garter, Extra Knight of the Most Ancient and Most Noble Order of the Thistle, Member of the Order of Merit, Grand Master and First and Principal Knight Grand Cross of the Most Excellent Order of the British Empire, Knight of the Order of Australia, Additional Member of the Order of New Zealand, Extra Companion of the Queen’s Service Order, Royal Chief of the Order of Logohu, Extraordinary Companion of the Order of Canada, Extraordinary Commander of the Order of Military Merit, Lord of Her Majesty’s Most Honourable Privy Council, Privy Councillor of the Queen’s Privy Council for Canada, Personal Aide-de-Camp to Her Majesty, Lord High Admiral of the United Kingdom.")

listOfClasses = [fooA, fooB, fooC]

Here I want to remove either fooA or fooB (it doesn't matter which), so that I am left with

listOfClasses = [fooB, fooC] # for example

So far I have the following:

setOfStrings = {c.string for c in listOfClasses}
newListOfClasses = []
for c in listOfClasses:
  if c.string in setOfStrings:
    newListOfClasses.append(c)
    setOfStrings.remove(c.string)

For the case above I get the following timings:

# len(listOfClasses) = 3
2.22 ms ± 24.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# len(listOfClasses) = 20
2.29 ms ± 119 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
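Timings in this format come from IPython's %timeit. Outside IPython, a comparable measurement could be sketched with the standard timeit module, as below. The function name dedupe_with_set and the sample data are hypothetical and chosen only for illustration. The numbers it prints depend on the machine, so no expected output is shown.

```python
import timeit


class Foo(object):
    def __init__(self, string):
        self.string = string


def dedupe_with_set(list_of_classes):
    """The set-and-loop approach from above, wrapped in a function."""
    set_of_strings = {c.string for c in list_of_classes}
    new_list = []
    for c in list_of_classes:
        if c.string in set_of_strings:
            new_list.append(c)
            set_of_strings.remove(c.string)
    return new_list


# Hypothetical test data: 20 objects sharing 5 distinct strings.
sample = [Foo("name%d" % (i % 5)) for i in range(20)]

# Same shape as the measurements above: 7 runs of 100 loops each.
runs = timeit.repeat(lambda: dedupe_with_set(sample), repeat=7, number=100)
per_loop = [t / 100 for t in runs]
print("best of 7 runs: %.3g s per loop" % min(per_loop))
```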

Relying on the uniqueness of dictionary keys in a dict comprehension should be very fast:

list({cls.string: cls for cls in listOfClasses}.values())

Full example:

class Foo(object):
    def __init__(self, string):
        self.string = string

fooA = Foo("alice")
fooB = Foo("alice")
fooC = Foo("His Royal Highness")

listOfClasses = [fooA, fooB, fooC]

print(list({cls.string: cls for cls in listOfClasses}.values()))
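One detail of the dict comprehension is that a later entry overwrites an earlier one stored under the same key, so the last duplicate is the one that survives. The question says either is acceptable. If the first occurrence were preferred, a variant using dict.setdefault might look like the sketch below. The helper names dedupe_keep_last and dedupe_keep_first are hypothetical, and the ordering shown assumes Python 3.7+, where dicts preserve insertion order.

```python
class Foo(object):
    def __init__(self, string):
        self.string = string


def dedupe_keep_last(items):
    # Later items overwrite earlier ones stored under the same key.
    return list({item.string: item for item in items}.values())


def dedupe_keep_first(items):
    # setdefault only stores an item if its key is not present yet.
    unique = {}
    for item in items:
        unique.setdefault(item.string, item)
    return list(unique.values())


fooA = Foo("alice")
fooB = Foo("alice")
fooC = Foo("His Royal Highness")
listOfClasses = [fooA, fooB, fooC]

kept_last = dedupe_keep_last(listOfClasses)
kept_first = dedupe_keep_first(listOfClasses)
print(kept_last == [fooB, fooC])   # True
print(kept_first == [fooA, fooC])  # True
```

Both variants return the original objects rather than copies, which matches the constraint that no new classes may be created.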

