
Python String Split on pattern without removing delimiter

I have a long string, and I want to break it into smaller strings whenever a certain pattern shows up (in the case below, 123 my):

my_str = '123 my string is long 123 my string is very long 123 my string is so long'

I want the result to be:

result = ['123 my string is long ', '123 my string is very long ', '123 my string is so long ']

The length of the string is unknown, and I don't want to remove anything from the main string.

You can also use a lookahead regex:

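import re
my_str = '123 my string is long 123 my string is very long 123 my string is so long'
# The leading '.' consumes the one character before each '123 my'
# (here, the space), so that character is dropped from the result.
re.split(r'.(?=123 my)', my_str)
=>
['123 my string is long',
 '123 my string is very long',
 '123 my string is so long']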
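
The leading . in that pattern consumes the character in front of each match, which is why the spaces are gone from the output. If nothing at all should be removed, a purely zero-width pattern may be closer to what the question asks for. This is a minimal sketch, assuming Python 3.7 or later (where re.split can split on empty matches); the (?<!^) guard is an addition that is not part of the answer above, and it only prevents an empty first item when the string starts with the pattern.

```python
import re

my_str = '123 my string is long 123 my string is very long 123 my string is so long'

# (?=123 my) is a zero-width lookahead: it matches the position just before
# each '123 my' without consuming any characters.
# (?<!^) rejects the position at the very start of the string.
result = re.split(r'(?<!^)(?=123 my)', my_str)
print(result)
# ['123 my string is long ', '123 my string is very long ', '123 my string is so long']
```

Joining the pieces back together with ''.join(result) gives the original string, so no character is lost.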

You can split on the delimiter and then add it back in with a list comprehension:

my_str = '123 my string is long 123 my string is very long 123 my string is so long'
delimiter = '123 my'
result = ['{}{}'.format(delimiter, s) for s in my_str.split(delimiter) if s]
print(result)

Output

['123 my string is long ', '123 my string is very long ', '123 my string is so long']

I don't know where the trailing space in the last list item of your desired output comes from; it's not in the original string, so it should be absent from the result.

Note that this only works if the string begins with the delimiter; otherwise the text before the first delimiter also gets a delimiter prepended to it.
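
One way to lift that restriction is to treat the first piece separately. The following is only a sketch under that assumption; split_keep_delimiter is a hypothetical helper name, not something from the answer above.

```python
def split_keep_delimiter(text, delimiter):
    """Split text before each delimiter and keep the delimiter in the pieces."""
    # The first piece is whatever precedes the first delimiter (possibly '').
    head, *rest = text.split(delimiter)
    parts = [head] if head else []
    # Every later piece originally followed a delimiter, so put it back.
    parts.extend(delimiter + piece for piece in rest)
    return parts


with_prefix = split_keep_delimiter('intro 123 my first 123 my second', '123 my')
print(with_prefix)
# ['intro ', '123 my first ', '123 my second']
```

When the string does begin with the delimiter, the leading empty piece is dropped and the result matches the list comprehension version.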

So... a little hacky, but you can do this in two steps:

 1. Find all matches and replace each one with a custom character sequence (or "\n") followed by the match itself.

 2. Split the new string on that custom sequence.

I did mine like this:

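import re

# Example values taken from the question; substitute your own.
regex_pattern = r'123 my'
text_you_want_to_split = '123 my string is long 123 my string is very long 123 my string is so long'
delimiter = "\n"   # or some custom sequence that won't occur in the string

def break_line(match):
    # Put the delimiter in front of the match so the match itself is kept.
    return delimiter + match.group()

lines = re.sub(regex_pattern, break_line, text_you_want_to_split)
# str.split treats the delimiter literally (re.split would read it as a regex).
# A match at position 0 leaves an empty first item, which is dropped here.
lines = [line for line in lines.split(delimiter) if line]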
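
The two-step approach depends on choosing a delimiter that never occurs in the text. If that cannot be guaranteed, a different technique is to slice the string at the start offset of each match instead of inserting a marker. This is a sketch of that alternative, not the method used above; split_before_matches is a hypothetical name.

```python
import re


def split_before_matches(pattern, text):
    """Split text just before each regex match, keeping every character."""
    # Start offsets of all matches; a match at offset 0 needs no cut.
    starts = [m.start() for m in re.finditer(pattern, text) if m.start() > 0]
    # Slice between consecutive cut points, from 0 to the end of the text.
    bounds = [0] + starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


my_str = '123 my string is long 123 my string is very long 123 my string is so long'
parts = split_before_matches(r'123 my', my_str)
print(parts)
# ['123 my string is long ', '123 my string is very long ', '123 my string is so long']
```

Because the pieces are plain slices of the input, the text can contain newlines or any other character without affecting the split.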
