
Python: split a list of strings into a list of lists of strings by length with a nested comprehension

I've got a list of strings and I'm trying to group them into a list of lists by string length.

i.e.

['a', 'b', 'ab', 'abc'] 

becomes

[['a', 'b'], ['ab'], ['abc']]

I've accomplished this like so:

lst = ['a', 'b', 'ab', 'abc']
lsts = []
for num in set(len(i) for i in lst):
    lsts.append([w for w in lst if len(w) == num])

I'm fine with that code, but I'm trying to wrap my head around comprehensions. I want to use nested comprehensions to do the same thing, but I can't figure out how.

>>> [[w for w in L if len(w) == num] for num in set(len(i) for i in L)]
[['a', 'b'], ['ab'], ['abc']]
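
One caveat with the comprehension above: the language does not guarantee the order in which a set yields its elements, so the sublists may not come out in ascending length order for every input. A minimal sketch, assuming ascending order is wanted, wraps the set in sorted(). The names words and by_length are chosen here for illustration only.

```python
words = ['abc', 'a', 'ab', 'b']

# sorted() fixes the order of the lengths; the inner comprehension
# collects every word of that length, preserving input order.
by_length = [[w for w in words if len(w) == n]
             for n in sorted(set(len(w) for w in words))]

print(by_length)
```

The inner comprehension still rescans the whole list once per distinct length, exactly as the original loop does.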

Also, itertools.groupby is likely to be a little more efficient, since it makes a single pass over the sorted list instead of rescanning it for every length.

from itertools import groupby

lst = ['a', 'b', 'ab', 'abc']
lst.sort(key=len)  # makes no change to this data, but all strings
                   # of a given length must occur together

lst = [list(grp) for _, grp in groupby(lst, key=len)]

results in

[['a', 'b'], ['ab'], ['abc']]
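
To see why the sort matters, here is a small sketch with an input that is deliberately not ordered by length. The list unsorted and the helper group_by_len are made up for illustration. groupby starts a new group each time the key changes, so without sorting, strings of the same length that are not adjacent land in separate groups.

```python
from itertools import groupby

def group_by_len(strings, presort):
    """Group strings by length with groupby, optionally sorting first."""
    items = sorted(strings, key=len) if presort else list(strings)
    return [list(grp) for _, grp in groupby(items, key=len)]

unsorted = ['a', 'ab', 'b', 'abc']

# Without the sort, 'a' and 'b' end up in different groups.
print(group_by_len(unsorted, presort=False))
# With the sort, each length appears exactly once.
print(group_by_len(unsorted, presort=True))
```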

This builds a list for every length from 1 to the maximum (some of the lists will be empty if a contains no strings of that length):

>>> a = ['a', 'b', 'ab', 'abc']
>>> m = max(len(x) for x in a)
>>> print([[x for x in a if len(x) == i + 1] for i in range(m)])
[['a', 'b'], ['ab'], ['abc']]

But if you only want lists for the lengths that actually occur in a, you must use set(len(k) for k in a) instead of range.

>>> print([[x for x in a if len(x) == i] for i in set(len(k) for k in a)])
[['a', 'b'], ['ab'], ['abc']]

There is no difference for the list ['a', 'b', 'ab', 'abc']. But if you change it a little, e.g. to ['a', 'b', 'ab', 'abcd'], you will see the difference:

>>> a = ['a', 'b', 'ab', 'abcd']
>>> print([[x for x in a if len(x) == i] for i in set(len(k) for k in a)])
[['a', 'b'], ['ab'], ['abcd']]

>>> print([[x for x in a if len(x) == i + 1] for i in range(max(len(x) for x in a))])
[['a', 'b'], ['ab'], [], ['abcd']]

L = ['a', 'b', 'ab', 'abc']
result = [[w for w in L if len(w) == n] for n in set(len(i) for i in L)]

from itertools import groupby

mylist = ['a', 'b', 'ab', 'abc']
[list(vals) for key, vals in groupby(mylist, key=len)]

(Note that since groupby only groups adjacent elements, you may need to sort mylist with key=len first.)

  • groupby returns an iterator of (key, vals) pairs, where key is the length and vals is another iterator over the data in that key's group.
  • list(vals) then converts that iterator of data into a list
  • the outer list is built from those lists
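
The steps in the list above can be written out as an explicit loop, which may be easier to follow than the one-line comprehension. This is only an unrolled sketch of the same call, and groups is a name chosen here for illustration.

```python
from itertools import groupby

mylist = ['a', 'b', 'ab', 'abc']
mylist.sort(key=len)  # groupby only groups adjacent items

groups = []
for key, vals in groupby(mylist, key=len):
    # key is the shared length; vals is an iterator over that group's strings
    groups.append(list(vals))  # materialize the group before moving on

print(groups)
```

Calling list(vals) inside the loop matters: each group iterator shares the underlying data with groupby, so it should be consumed before the loop advances to the next key.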
