
How do I split a multi-line string into multiple lines in Python?

I have a multi-line string:

inputString = "Line 1\nLine 2\nLine 3"

I want an array in which each element contains at most 2 lines, as below:

outputStringList = ["Line 1\nLine 2", "Line 3"]

Can I convert inputString to outputStringList in Python? Any help will be appreciated.

You could use a regular expression that matches either 2 lines (with a lookahead so that the trailing linefeed is not captured) or a single line (to handle a last, odd line). I expanded your example to show that it works for more than 3 lines, with a little "cheat": appending a newline at the end so that all cases are handled:

import re

s = "Line 1\nLine 2\nLine 3\nline4\nline5"
result = re.findall(r'(.+?\n.+?(?=\n)|.+)', s+"\n")

print(result)

Result:

['Line 1\nLine 2', 'Line 3\nline4', 'line5']

The "add newline" cheat also lets an input with an even number of lines be processed properly:

s = "Line 1\nLine 2\nLine 3\nline4\nline5\nline6"

Result:

['Line 1\nLine 2', 'Line 3\nline4', 'line5\nline6']
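The trailing-newline trick can be avoided by matching one line followed by up to n - 1 further lines. The following is a minimal sketch, not part of the original answer: the helper name group_lines_re and its parameter n are made up for illustration, and it assumes that empty lines can be treated as separators and dropped.

```python
import re


def group_lines_re(text, n=2):
    # Hypothetical helper: match one non-empty line, then up to n - 1 more
    # non-empty lines, each preceded by a single newline.
    pattern = r'[^\n]+(?:\n[^\n]+){0,%d}' % (n - 1)
    return re.findall(pattern, text)


s = "Line 1\nLine 2\nLine 3\nline4\nline5"
print(group_lines_re(s))
# ['Line 1\nLine 2', 'Line 3\nline4', 'line5']
print(group_lines_re(s, n=3))
# ['Line 1\nLine 2\nLine 3', 'line4\nline5']
```

Because the group is non-capturing, re.findall returns the whole match for each chunk.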

Here is an alternative that uses the grouper itertools recipe to group any number of lines together.

Note: you can implement this recipe by hand, or you can install a third-party library that implements it for you, e.g. pip install more_itertools.

Code

from more_itertools import grouper


def group_lines(iterable, n=2):
    # Recent more_itertools releases take the iterable first, then n.
    return ["\n".join(line for line in lines if line)
            for lines in grouper(iterable.split("\n"), n, fillvalue="")]

Demo

s1 = "Line 1\nLine 2\nLine 3"
s2 = "Line 1\nLine 2\nLine 3\nLine4\nLine5"


group_lines(s1)
# ['Line 1\nLine 2', 'Line 3']

group_lines(s2)
# ['Line 1\nLine 2', 'Line 3\nLine4', 'Line5']

group_lines(s2, n=3)
# ['Line 1\nLine 2\nLine 3', 'Line4\nLine5']

Details

group_lines() splits the string into lines and then groups the lines n at a time via grouper.

list(grouper(s1.split("\n"), 2, fillvalue=""))
[('Line 1', 'Line 2'), ('Line 3', '')]

Finally, for each group of lines, only the non-empty strings are rejoined with a newline character.

See the more_itertools docs for more details on grouper.
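One consequence of filtering out empty strings is that blank lines already present in the input are dropped along with the padding. If blank lines should survive, a unique sentinel can be used as the fill value instead. This is only a sketch built on the standard-library zip_longest; the names _MISSING and group_lines_keep_blank are hypothetical and do not come from more_itertools.

```python
from itertools import zip_longest

# Unique padding marker; it cannot be confused with a real (possibly empty) line.
_MISSING = object()


def group_lines_keep_blank(text, n=2):
    # Hypothetical variant of group_lines() that preserves empty lines.
    lines = text.split("\n")
    chunks = zip_longest(*[iter(lines)] * n, fillvalue=_MISSING)
    return ["\n".join(line for line in chunk if line is not _MISSING)
            for chunk in chunks]


print(group_lines_keep_blank("Line 1\n\nLine 3"))
# ['Line 1\n', 'Line 3']
```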

I hope I understood your logic correctly. If you want a list of strings, each containing at most one newline delimiter, then the following code snippet will work:

# Newline-delimited string
a = "Line 1\nLine 2\nLine 3\nLine 4\nLine 5\nLine 6\nLine 7"
# Resulting list
b = []

# First split the string into "1-line-long" pieces
a = a.split("\n")

for i in range(0, len(a), 2):

    # Then join the pieces by 2's and append to the resulting list.
    # The last slice holds a single piece when the list is odd-sized,
    # so a one-line input is handled as well.
    b.append("\n".join(a[i:i + 2]))

print(b)

['Line 1\nLine 2', 'Line 3\nLine 4', 'Line 5\nLine 6', 'Line 7']

Although this solution is neither the fastest nor the best, it is easy to understand and does not require extra libraries.

I wanted to post the grouper recipe from the itertools docs as well, but PyToolz' partition_all is actually a bit nicer.

from toolz import partition_all

s = "Line 1\nLine 2\nLine 3\nLine 4\nLine 5"
result = ['\n'.join(tup) for tup in partition_all(2, s.splitlines())]
# ['Line 1\nLine 2', 'Line 3\nLine 4', 'Line 5']
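If installing toolz is not an option, similar chunking can be sketched with itertools.islice from the standard library. The generator name partition_lines is hypothetical, and this is not meant to describe how toolz implements partition_all; it only produces the same chunks for this use case.

```python
from itertools import islice


def partition_lines(text, n=2):
    # Hypothetical stdlib-only stand-in: lazily yield chunks of up to n lines,
    # each rejoined with newlines.
    it = iter(text.splitlines())
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield "\n".join(chunk)


s = "Line 1\nLine 2\nLine 3\nLine 4\nLine 5"
print(list(partition_lines(s)))
# ['Line 1\nLine 2', 'Line 3\nLine 4', 'Line 5']
```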

Here's the grouper solution for the sake of completeness:

from itertools import zip_longest

# Recipe from the itertools docs.
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# Group the lines of s, not its individual characters.
result = ['\n'.join((a, b)) if b else a for a, b in grouper(s.splitlines(), 2)]

Use str.splitlines() to split the full input into lines:

>>> inputString = "Line 1\nLine 2\nLine 3"
>>> outputStringList = inputString.splitlines()
>>> print(outputStringList)
['Line 1', 'Line 2', 'Line 3']

Then join all lines except the last one to obtain the desired result:

>>> result = ['\n'.join(outputStringList[:-1])] + outputStringList[-1:]
>>> print(result)
['Line 1\nLine 2', 'Line 3']

Bonus: write a function that does the same for any number of desired lines:

def split_to_max_lines(inputStr, n):
    lines = inputStr.splitlines()
    # This defines which element of the list becomes the 2nd element of the
    # final result. For n = 2, index = -1; for n = 4, index = -3, etc.
    split_index = -(n - 1)
    result = ['\n'.join(lines[:split_index])]
    result += lines[split_index:]
    return result

print(split_to_max_lines("Line 1\nLine 2\nLine 3\nline 4\nLine 5\nLine 6", 2))
print(split_to_max_lines("Line 1\nLine 2\nLine 3\nline 4\nLine 5\nLine 6", 4))
print(split_to_max_lines("Line 1\nLine 2\nLine 3\nline 4\nLine 5\nLine 6", 5))

Returns:

['Line 1\nLine 2\nLine 3\nline 4\nLine 5', 'Line 6']
['Line 1\nLine 2\nLine 3', 'line 4', 'Line 5', 'Line 6']
['Line 1\nLine 2', 'Line 3', 'line 4', 'Line 5', 'Line 6']

b = "a\nb\nc\nd".split("\n", 3)
c = ["\n".join(b[:-1]), b[-1]]
print(c)

gives

['a\nb\nc', 'd']
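The same result can be obtained in one step with str.rsplit, which splits starting from the right. This is a small sketch, not from the original answer; like the snippet above, it only separates the last line and does not pair lines two by two.

```python
text = "a\nb\nc\nd"
# Split once, starting from the right: everything before the last newline,
# then the last line.
head_and_tail = text.rsplit("\n", 1)
print(head_and_tail)
# ['a\nb\nc', 'd']
```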

I'm not sure what you mean by "a maximum of 2 lines" and how you'd hope to achieve that. However, splitting on newlines is fairly simple.

'Line 1\nLine 2\nLine 3'.split('\n')

This will result in:

['Line 1', 'Line 2', 'Line 3']

To get the weird allowance for "some" line splitting, you'll have to write your own logic.
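One way such logic might look, assuming that "a maximum of 2 lines" means consecutive groups of at most two lines each. The function name chunk_lines and the parameter max_lines are hypothetical, chosen only for this sketch.

```python
def chunk_lines(text, max_lines=2):
    # Hypothetical helper: split on newlines, then rejoin consecutive
    # slices of at most max_lines lines each.
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    lines = text.split("\n")
    return ["\n".join(lines[i:i + max_lines])
            for i in range(0, len(lines), max_lines)]


print(chunk_lines("Line 1\nLine 2\nLine 3"))
# ['Line 1\nLine 2', 'Line 3']
```

Slicing past the end of a list is safe in Python, so the last chunk simply ends up shorter when the number of lines is not a multiple of max_lines.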
