
Fastest way to split list into multiple sublists based on several conditions

What is the fastest way to split a list into multiple sublists based on conditions?

One way to split a listOfObjects into sublists (three sublists for demonstration, but more are possible):

listOfObjects = [.......]
l1, l2, l3 = [], [], []
for l in listOfObjects:
    if l.someAttribute == "l1":
        l1.append(l)
    elif l.someAttribute == "l2":
        l2.append(l)
    else:
        l3.append(l)

This way does not seem pythonic at all and also takes quite some time. Are there faster approaches, e.g. using map?

Similar question, but with only two conditions and no statement about speed.

You could use collections.defaultdict here for mapping.

from collections import defaultdict

d = defaultdict(list)

for l in listOfObjects:
    d[l.someAttribute].append(l)

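out = d.values()
# Note: d['l3'] holds only objects whose someAttribute == 'l3';
# it is not a catch-all like the else branch in the question.
l1, l2, l3 = d['l1'], d['l2'], d['l3']

d would be of the form: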

{ 
  attr1 : [...],
  attr2 : [...],
  ...
  attrn : [...]
}
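A minimal runnable sketch of the grouping above. The Item class and the sample data are hypothetical stand-ins for the real objects, and the last line is one possible way to rebuild the else catch-all from the remaining keys:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical stand-in for the objects in listOfObjects.
    someAttribute: str


listOfObjects = [Item("l1"), Item("l2"), Item("other"), Item("l1")]

# Group every object under the value of its attribute in a single pass.
d = defaultdict(list)
for obj in listOfObjects:
    d[obj.someAttribute].append(obj)

l1, l2 = d["l1"], d["l2"]
# Everything that is neither "l1" nor "l2" forms the catch-all group.
l3 = [obj for key, group in d.items() if key not in ("l1", "l2") for obj in group]
```

The grouping itself does not need to know the attribute values in advance, so it scales to more than three sublists without extra branches.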

Omg, that similar question's answer is amazing. I hadn't thought about that for splitting... Anyway, you can do something similar, but it would be less readable:

for l in listOfObjects:
    (l3, l2, l1)[(l.someAttribute == "l1")*2 or l.someAttribute == "l2"].append(l)

This will work for any boolean conditions. The or operator returns the first truthy operand, or the last operand (here False) if none is truthy. True == 1, so we multiply by 2 for the condition whose index should be 2.
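To see how the index is computed, here is a small sketch that pulls the expression out into a helper. The name bucket_index and the sample strings are hypothetical and only used for illustration:

```python
def bucket_index(attr):
    # Hypothetical helper mirroring the index expression above:
    # "l1" gives 2, "l2" gives True (which indexes as 1), anything else gives False (0).
    return (attr == "l1") * 2 or attr == "l2"


l1, l2, l3 = [], [], []
buckets = (l3, l2, l1)  # position 0 is the catch-all

for attr in ["l1", "l2", "something else"]:
    buckets[bucket_index(attr)].append(attr)
```

Indexing with True or False works because bool is a subclass of int in Python.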

But as I said, it's not really readable.

As for speed: or short-circuits and returns the first truthy value, so the number of condition checks should be similar to your approach. You might want to keep the lookup tuple defined outside of the loop.

But unless you are doing something really specialized that requires every single thing to be optimized, it doesn't make a difference.
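If you want to check the speed claim on your own data, a rough timeit sketch along these lines could help. The Item class, the generated test data, and the function names are assumptions made for the example, and the numbers will vary by machine and Python version:

```python
import timeit


class Item:
    # Hypothetical stand-in for the objects in listOfObjects.
    def __init__(self, someAttribute):
        self.someAttribute = someAttribute


listOfObjects = [Item(("l1", "l2", "other")[i % 3]) for i in range(10_000)]


def split_if_elif(objs):
    l1, l2, l3 = [], [], []
    for obj in objs:
        if obj.someAttribute == "l1":
            l1.append(obj)
        elif obj.someAttribute == "l2":
            l2.append(obj)
        else:
            l3.append(obj)
    return l1, l2, l3


def split_tuple_index(objs):
    l1, l2, l3 = [], [], []
    buckets = (l3, l2, l1)  # defined once, outside the loop
    for obj in objs:
        buckets[(obj.someAttribute == "l1") * 2 or obj.someAttribute == "l2"].append(obj)
    return l1, l2, l3


for fn in (split_if_elif, split_tuple_index):
    seconds = timeit.timeit(lambda: fn(listOfObjects), number=10)
    print(f"{fn.__name__}: {seconds:.4f} s for 10 runs")
```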


And here is a more readable version using a dict, which works because your conditions are based on equality (note: the attribute value also has to be hashable):

lookup = {"l1": l1, "l2": l2}
for l in listOfObjects:
    lookup.get(l.someAttribute, l3).append(l)

dict.get takes a default value as its second argument, so it's perfect for our else catch-all.
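The same idea can be wrapped in a function so it works for any number of keys. This is only a sketch: split_by_attribute and the Item class are hypothetical names, not part of any library:

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical stand-in for the objects in listOfObjects.
    someAttribute: str


def split_by_attribute(objects, keys):
    """Return one list per key, in order, plus a final catch-all list."""
    buckets = [[] for _ in keys]
    catch_all = []
    lookup = dict(zip(keys, buckets))
    for obj in objects:
        # Unknown attribute values fall through to the catch-all list.
        lookup.get(obj.someAttribute, catch_all).append(obj)
    return (*buckets, catch_all)


listOfObjects = [Item("l1"), Item("l2"), Item("other"), Item("l1")]
l1, l2, l3 = split_by_attribute(listOfObjects, ["l1", "l2"])
```

Each object costs one dict lookup, regardless of how many keys there are, whereas an if/elif chain grows with the number of conditions.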
