
How to split text after a certain number of non-space and non-paragraph characters?

I would like to split a text after a certain number of non-space and non-paragraph (newline) characters.

So far, I know that you can do this to split a string after a total number of characters:

cutOff = 10
splitString = oldString[0:cutOff]

But how do I do this so that spaces are not included in the character count?

You can use a while loop:

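oldString = "Hello world"
cutOff = 10  # number of non-space, non-newline characters to keep

# Walk the string; each space or newline found before the cut point
# pushes the cut point one position to the right, so it is not counted.
i = 0
while i < cutOff and cutOff < len(oldString):
    if oldString[i] in (' ', '\n'):
        cutOff += 1
    i += 1

# cutOff now holds the slice index (the original value has been overwritten)
splitString = oldString[:cutOff]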
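
The same idea can be wrapped in a reusable function. The following is only a sketch: the name `split_after_n_nonspace` is made up for illustration, and it assumes that every whitespace character recognised by `str.isspace()` (tabs included, not just spaces and newlines) should be left out of the count. It returns both halves instead of modifying the cutoff in place.

```python
def split_after_n_nonspace(text, cutoff):
    """Split text right after its cutoff-th non-whitespace character."""
    if cutoff <= 0:
        return "", text
    seen = 0  # non-whitespace characters counted so far
    for index, char in enumerate(text):
        if not char.isspace():
            seen += 1
            if seen == cutoff:
                return text[:index + 1], text[index + 1:]
    # Fewer than cutoff non-whitespace characters: nothing to split off.
    return text, ""


head, tail = split_after_n_nonspace("Hello world", 7)
print(repr(head), repr(tail))  # 'Hello wo' 'rld'
```

Whitespace that follows the last counted character ends up at the start of the second half, which is also what the slice in the loop version produces.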

You can use a regular expression. This returns a two-element tuple containing the two halves of the input string, broken at the desired location:

import re

data = """Now is  the time
for all   good men
to come"""

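def break_at_ignoring_whitespace(text, break_at):
    # Group 1: break_at non-whitespace characters, each optionally preceded
    # by whitespace. Group 3: the remainder (re.S lets "." match newlines).
    # \S is used instead of \w so that punctuation also counts as a character.
    m = re.match(r"((\s*\S){%d})(.*)" % break_at, text, re.S)
    # No match means fewer than break_at such characters: return the text whole.
    return (m.group(1), m.group(3)) if m else (text, '')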

r = break_at_ignoring_whitespace(data, 14)

print(">>" + r[0] + "<<")
print(">>" + r[1] + "<<")

Result:

>>Now is  the time
fo<<
>>r all   good men
to come<<
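
If the goal is to cut the whole text into successive pieces rather than just two halves, a similar pattern can be applied repeatedly with `re.findall`. This is a sketch under assumptions, not part of the answer above: the helper name `chunk_ignoring_whitespace` is hypothetical, it uses `\S` so that punctuation counts as a character, and any whitespace at the very end of the text is dropped because it is not followed by a character to attach to.

```python
import re


def chunk_ignoring_whitespace(text, size):
    """Cut text into pieces holding at most size non-whitespace characters."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # Each match is 1..size repetitions of "optional whitespace, then one
    # non-whitespace character"; \s also matches newlines, so no flag is needed.
    return re.findall(r"(?:\s*\S){1,%d}" % size, text)


data = """Now is  the time
for all   good men
to come"""

for piece in chunk_ignoring_whitespace(data, 14):
    print(">>" + piece + "<<")
```

Every piece except possibly the last contains exactly `size` non-whitespace characters, and the first piece matches the first half printed in the result above.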
