How to split text after a certain number of non-space and non-paragraph characters?
I would like to split a text after a certain number of non-space and non-paragraph characters, that is, characters other than spaces and line breaks.
So far, I know that you can do this to split a string after a given total number of characters:
cutOff = 10
splitString = oldString[0:cutOff]
But how do I do this so that spaces are not included in the character count?
You can do this with a while loop:
oldString = "Hello world"
cutOff = 10
i = 0
# Each space or newline seen before the cut-off pushes the cut-off one position later.
while i < cutOff and cutOff < len(oldString):
    if oldString[i] in [' ', '\n']: cutOff += 1
    i += 1
splitString = oldString[:cutOff]
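The same counting idea can be wrapped in a reusable function. This is only a minimal sketch: the name split_after_non_whitespace and its parameters are made up for illustration, and it assumes that every whitespace character (as reported by str.isspace(), so tabs as well as spaces and newlines) should be skipped in the count.

```python
def split_after_non_whitespace(text, limit):
    """Split text right after its limit-th non-whitespace character.

    Hypothetical helper; returns a (head, tail) tuple.
    """
    if limit <= 0:
        return "", text
    seen = 0
    for index, char in enumerate(text):
        if not char.isspace():  # assumption: any whitespace is ignored
            seen += 1
            if seen == limit:
                return text[:index + 1], text[index + 1:]
    # Fewer than `limit` non-whitespace characters: nothing to split off.
    return text, ""


head, tail = split_after_non_whitespace("Hello world", 7)
print(repr(head))  # 'Hello wo'
print(repr(tail))  # 'rld'
```

Returning both halves makes it easy to keep working with the remainder of the text.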
You can use a regular expression. This returns a two-element tuple containing the two halves of the input string, broken at the desired location:
import re
data = """Now is the time
for all good men
to come"""
def break_at_ignoring_whitespace(text, break_at):
    m = re.match(r"((\s*\w){%d})(.*)" % break_at, text, re.S)
    return (m.group(1), m.group(3)) if m else (text, '')
r = break_at_ignoring_whitespace(data, 14)
print(">>" + r[0] + "<<")
print(">>" + r[1] + "<<")
Result:
>>Now is the time
fo<<
>>r all good men
to come<<
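One caveat with the pattern above: \w matches only letters, digits and underscores, so if the text contains punctuation before the break point, the match fails and the whole string comes back unsplit. If punctuation should count as a character too, a variant using \S (any non-whitespace character) might look like the sketch below. The function name is hypothetical, and whether punctuation should be counted is an assumption about what the question wants.

```python
import re


def break_at_counting_punctuation(text, break_at):
    """Hypothetical variant: counts every non-whitespace character."""
    # (?:\s*\S) consumes optional whitespace followed by exactly one
    # non-whitespace character; re.S lets (.*) span line breaks.
    m = re.match(r"((?:\s*\S){%d})(.*)" % break_at, text, re.S)
    return (m.group(1), m.group(2)) if m else (text, "")


print(break_at_counting_punctuation("Hi, there!", 4))  # ('Hi, t', 'here!')
```

With the original \w pattern, the same call would return ('Hi, there!', '') because the comma stops the match.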