Split and flatten a list of Strings and None values using a comprehension
Given a list that contains both strings and None values, where some of the strings have embedded newlines, I want to split the strings with newlines into multiple strings and return a flattened list.
I've written code to do this using a generator function, but it is rather bulky, and I'm wondering whether it can be done more concisely with a list comprehension or a function from the itertools module.
itertools.chain doesn't seem to offer a way to decline to iterate over non-iterable elements such as None.
def expand_newlines(lines):
    r"""Split strings with newlines into multiple strings.

    >>> l = ["1\n2\n3", None, "4\n5\n6"]
    >>> list(expand_newlines(l))
    ['1', '2', '3', None, '4', '5', '6']
    """
    for line in lines:
        if line is None:
            yield line
        else:
            for l in line.split('\n'):
                yield l
You can use yield from.
def expand(lines):
    for line in lines:
        if isinstance(line, str):
            yield from line.split('\n')
        elif line is None:
            yield line

list(expand(l))
# ['1', '2', '3', None, '4', '5', '6']
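For readers unfamiliar with `yield from`: in a simple generator like this one, it behaves like an inner `for` loop that yields each item of the iterable. The sketch below puts the two spellings side by side; the names `expand_loop` and `expand_delegating` are made up for illustration and do not appear in the answers.

```python
def expand_loop(lines):
    """Explicit inner loop, as in the question."""
    for line in lines:
        if line is None:
            yield line
        else:
            for part in line.split("\n"):
                yield part


def expand_delegating(lines):
    """Same logic, with the inner loop replaced by `yield from`."""
    for line in lines:
        if line is None:
            yield line
        else:
            yield from line.split("\n")


data = ["1\n2\n3", None, "4\n5\n6"]
loop_result = list(expand_loop(data))
delegating_result = list(expand_delegating(data))
print(loop_result == delegating_result)  # True
print(delegating_result)  # ['1', '2', '3', None, '4', '5', '6']
```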
Here's a one-liner, but I think @Ch3steR's solution is more readable.
from itertools import chain
list(chain.from_iterable(i.splitlines() if i is not None and '\n' in i else [i]
                         for i in lines))
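Note that this one-liner uses `str.splitlines()` where the other answers use `split('\n')`. The two agree on the example input but differ on some edge cases, which may or may not matter for your data. A short sketch of the differences (the variable names are only for illustration):

```python
text = "1\n2\n"

# split("\n") keeps an empty string after a trailing newline ...
trailing_split = text.split("\n")
print(trailing_split)       # ['1', '2', '']

# ... while splitlines() does not.
trailing_splitlines = text.splitlines()
print(trailing_splitlines)  # ['1', '2']

# splitlines() also treats "\r\n" as a single line break.
crlf_split = "a\r\nb".split("\n")
crlf_splitlines = "a\r\nb".splitlines()
print(crlf_split)           # ['a\r', 'b']
print(crlf_splitlines)      # ['a', 'b']
```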
You could use itertools.chain if you did the following:
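import itertools

def expand_newlines(lines):
    # Test "is not None" rather than truthiness, so that an empty string
    # stays '' instead of being turned into None.
    return itertools.chain.from_iterable(
        x.split("\n") if x is not None else [None] for x in lines)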
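Whether the condition tests truthiness (`if x`) or identity (`if x is not None`) matters when the input may contain empty strings, because `""` is falsy as well. A small sketch of the difference, with both function names made up for illustration:

```python
from itertools import chain


def expand_truthy(lines):
    # Hypothetical variant: every falsy value (None, "") comes out as None.
    return chain.from_iterable(x.split("\n") if x else [None] for x in lines)


def expand_is_none(lines):
    # Hypothetical variant: only None is passed through unchanged.
    return chain.from_iterable(
        x.split("\n") if x is not None else [None] for x in lines
    )


data = ["1\n2", "", None]
truthy_result = list(expand_truthy(data))
is_none_result = list(expand_is_none(data))
print(truthy_result)   # ['1', '2', None, None]
print(is_none_result)  # ['1', '2', '', None]
```

If the list never contains empty strings, the two variants give the same result.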
Using more_itertools.collapse to flatten nested lists:
Given
import more_itertools as mit
lst = ["1\n2\n3", None, "7\n8\n9"]
Demo
list(mit.collapse([x.split("\n") if x else x for x in lst]))
# ['1', '2', '3', None, '7', '8', '9']
more_itertools is a third-party package. Install via > pip install more_itertools
If you can modify the list in place, you could do:
lst = ["1\n2\n3", None, "4\n5\n6"]
for i in range(len(lst))[::-1]:
    if isinstance(lst[i], str):
        lst[i:i+1] = lst[i].split("\n")
print(lst)  # ['1', '2', '3', None, '4', '5', '6']
This solution uses the fact that you can not only read Python list slices but also assign to them. It moves from right to left because otherwise I would need to keep count of the additional items, which would make it harder.
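To make the slice-assignment point concrete, here is a minimal sketch. The first part shows a one-element slice being replaced by several elements. The second part is an assumed left-to-right variant (not from the answer) that illustrates the extra bookkeeping the right-to-left loop avoids.

```python
items = ["a", "b\nc", "d"]

# Assigning to a one-element slice can replace it with several elements.
items[1:2] = items[1].split("\n")
print(items)  # ['a', 'b', 'c', 'd']

# Left-to-right alternative: the index has to advance by the number of
# parts just inserted, otherwise they would be visited again.
lst = ["1\n2\n3", None, "4\n5\n6"]
i = 0
while i < len(lst):
    if isinstance(lst[i], str):
        parts = lst[i].split("\n")
        lst[i:i + 1] = parts
        i += len(parts)  # skip over the pieces just inserted
    else:
        i += 1
print(lst)  # ['1', '2', '3', None, '4', '5', '6']
```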
Similar to @blueteeth's answer, but more concise by way of inverting the logic:
import itertools
chainfi = itertools.chain.from_iterable

def expand_newlines(lines):
    r"""Split strings with newlines into multiple strings.

    >>> l = ["1\n2\n3", None, "4\n5\n6"]
    >>> list(expand_newlines(l))
    ['1', '2', '3', None, '4', '5', '6']
    """
    return chainfi([None] if l is None else l.split('\n') for l in lines)
None is the special case, so that's what we should be checking for.
This is concise enough that I wouldn't even bother writing a function for it; I only kept it in a function to confirm via doctest that it works.
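Since the question asks specifically about a list comprehension, the same inverted logic can also be written without itertools as a nested comprehension. This is a sketch rather than something taken from the answers above; the variable names are only for illustration.

```python
lines = ["1\n2\n3", None, "4\n5\n6"]

# The outer loop walks the items; the inner loop walks the pieces of each item.
# None is wrapped in a one-element list so that it survives the inner loop.
flat = [
    part
    for line in lines
    for part in ([None] if line is None else line.split("\n"))
]
print(flat)  # ['1', '2', '3', None, '4', '5', '6']
```

The chain.from_iterable version stays lazy, whereas this one builds the whole list at once, which is usually fine for small inputs.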