
Fastest way to remove duplicates in a list using each entry's instance variables

TL;DR

Is there a faster way to do the following?

listOfClasses = [fooA, fooB, fooC]
setOfStrings = {c.string for c in listOfClasses}
newListOfClasses = []
for c in listOfClasses:
  if c.string in setOfStrings:
    newListOfClasses.append(c)
    setOfStrings.remove(c.string)

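Restrictions/caveats:

  • listOfClasses

    • Entries are not necessarily unique; the list may contain several instances with the same string
    • The number of entries is usually in the range 3-4, but can be as high as ~20
  • newListOfClasses

    • No new instances may be created; the result must reuse the same objects (the objects are not mine and I don't know how they are initialised, so I don't know what they hold besides the string).
    • The result must at least be iterable
    • The order of the result does not matter
    • Overwriting the original listOfClasses is fine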

Suppose I have a class

class Foo(object):
  def __init__(self,string):
    self.string = string

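and a list of instances from which I want to remove every entry whose "string" instance variable duplicates that of another entry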

fooA = Foo("alice")
fooB = Foo("alice")
fooC = Foo("His Royal Highness The Prince Philip, Duke of Edinburgh, Earl of Merioneth, Baron Greenwich, Royal Knight of the Most Noble Order of the Garter, Extra Knight of the Most Ancient and Most Noble Order of the Thistle, Member of the Order of Merit, Grand Master and First and Principal Knight Grand Cross of the Most Excellent Order of the British Empire, Knight of the Order of Australia, Additional Member of the Order of New Zealand, Extra Companion of the Queen’s Service Order, Royal Chief of the Order of Logohu, Extraordinary Companion of the Order of Canada, Extraordinary Commander of the Order of Military Merit, Lord of Her Majesty’s Most Honourable Privy Council, Privy Councillor of the Queen’s Privy Council for Canada, Personal Aide-de-Camp to Her Majesty, Lord High Admiral of the United Kingdom.")

listOfClasses = [fooA, fooB, fooC]

Here I want to remove either fooA or fooB (it doesn't matter which), so that I am left with

listOfClasses = [fooB, fooC] # for example

So far I have the following:

setOfStrings = {c.string for c in listOfClasses}
newListOfClasses = []
for c in listOfClasses:
  if c.string in setOfStrings:
    newListOfClasses.append(c)
    setOfStrings.remove(c.string)

For the above, I get the following timings:

# len(listOfClasses) = 3
2.22 ms ± 24.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# len(listOfClasses) = 20
2.29 ms ± 119 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
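
The timings above look like IPython %timeit output. Outside IPython, a comparable measurement could be sketched with the standard timeit module. The function name dedupe_with_set, the test data and the run count below are assumptions made for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit


class Foo(object):
    def __init__(self, string):
        self.string = string


def dedupe_with_set(items):
    # The loop from the question, wrapped in a function so it can be timed.
    remaining = {c.string for c in items}
    result = []
    for c in items:
        if c.string in remaining:
            result.append(c)
            remaining.remove(c.string)
    return result


for size in (3, 20):
    # Made-up test data: only two distinct strings, so duplicates are present.
    items = [Foo("name%d" % (i % 2)) for i in range(size)]
    runs = 10000
    seconds = timeit.timeit(lambda: dedupe_with_set(items), number=runs)
    print("len = %d: %.3f us per call" % (size, seconds / runs * 1e6))
```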

Relying on the uniqueness of dictionary keys in a dict comprehension should be very fast:

list({cls.string: cls for cls in listOfClasses}.values())

Full example:

class Foo(object):
    def __init__(self, string):
        self.string = string

fooA = Foo("alice")
fooB = Foo("alice")
fooC = Foo("His Royal Highness")

listOfClasses = [fooA, fooB, fooC]

print(list({cls.string: cls for cls in listOfClasses}.values()))
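
One detail worth noting about the dict comprehension: when two objects share the same string, the later one overwrites the earlier one, so the last duplicate is the one kept (the question says either is acceptable). The sketch below checks that the original objects are reused rather than copied. It also shows a hypothetical helper, dedupe_keep_first, which is not part of the approach above, for the case where the first occurrence should win instead.

```python
class Foo(object):
    def __init__(self, string):
        self.string = string


def dedupe_keep_last(items):
    # Same technique as above: later items overwrite earlier ones per key.
    return list({item.string: item for item in items}.values())


def dedupe_keep_first(items):
    # Hypothetical variant: setdefault stores an item only if the key is new.
    unique = {}
    for item in items:
        unique.setdefault(item.string, item)
    return list(unique.values())


fooA = Foo("alice")
fooB = Foo("alice")
fooC = Foo("His Royal Highness")
listOfClasses = [fooA, fooB, fooC]

last = dedupe_keep_last(listOfClasses)
first = dedupe_keep_first(listOfClasses)

# The result order shown here assumes Python 3.7+, where dicts keep
# insertion order.
print(last[0] is fooB, last[1] is fooC)    # True True
print(first[0] is fooA, first[1] is fooC)  # True True
```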

