Python split string with split character and escape character
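In Python, how can I split a string with a regex by the following ruleset:
1. Split by a split character (e.g. ;).
2. Do not split if that split character is escaped by an escape character (e.g. :).
3. Do split if the escape character is itself escaped (e.g. ::;).
So splitting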
"foo;bar:;baz::;one:two;::three::::;four;;five:::;six;:seven;::eight"
should yield
["foo", "bar:;baz::", "one:two", "::three::::", "four", "", "five:::;six", ":seven", "::eight"]
My own attempt was:
re.split(r'(?<!:);', str)
which cannot handle rule #3.
If matching is also an option, and the empty match '' is not required:
(?::[:;]|[^;\n])+
(?: - Start of a non-capturing group.
:[:;] - Match : followed by either : or ;
| - Or
[^;\n] - Match any single character except ; or a newline.
)+ - Close the non-capturing group and repeat it 1+ times.

import re
regex = r"(?::[:;]|[^;\n])+"
# Named s rather than str to avoid shadowing the built-in str type.
s = "foo;bar:;baz::;one:two;::three::::;four;;five:::;six;:seven;::eight"
print(re.findall(regex, s))
Output
['foo', 'bar:;baz::', 'one:two', '::three::::', 'four', 'five:::;six', ':seven', '::eight']
If you want the empty match, you could add two lookarounds that match the position with a ; on both the left and the right:
(?::[:;]|[^;\n]|(?<=;)(?=;))+
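As a quick check, here is a minimal sketch that runs this extended pattern over the sample string from the question. The variable names are only illustrative.

```python
import re

# Same alternation as before, plus an empty alternative that only
# succeeds at a position with ";" on both sides.
pattern = r"(?::[:;]|[^;\n]|(?<=;)(?=;))+"
s = "foo;bar:;baz::;one:two;::three::::;four;;five:::;six;:seven;::eight"

result = re.findall(pattern, s)
print(result)
```

Note that the lookarounds only recover an empty field that sits between two separators; an empty field at the very start or end of the string would still be dropped by this matching approach.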
You could use the regex module with the following pattern to split on:
(?<!:)(?:::)*\K;
(?<!:) - Negative lookbehind: the position is not preceded by a colon.
(?:::)* - A non-capturing group matching two literal colons, 0+ times.
\K - Reset the starting point of the reported match.
; - A literal semicolon.
For example:
import regex as re
s = 'foo;bar:;baz::;one:two;::three::::;four;;five:::;six;:seven;::eight'
lst = re.split(r'(?<!:)(?:::)*\K;', s)
print(lst) # ['foo', 'bar:;baz::', 'one:two', '::three::::', 'four', '', 'five:::;six', ':seven', '::eight']
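If installing the third-party regex module is not an option, the same rules can also be applied without a regular expression by scanning the string once. The following is only a sketch: split_escaped and its sep and esc parameters are hypothetical names that do not appear in the answers above, and, like the expected output in the question, it leaves the escape characters in place rather than unescaping them.

```python
def split_escaped(text, sep=";", esc=":"):
    """Split text on sep, treating esc+sep and esc+esc as escaped pairs."""
    parts, current = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == esc and i + 1 < len(text) and text[i + 1] in (sep, esc):
            # Escaped pair: keep both characters verbatim and skip past them.
            current.append(text[i:i + 2])
            i += 2
        elif ch == sep:
            # Unescaped separator: close the current field (possibly empty).
            parts.append("".join(current))
            current = []
            i += 1
        else:
            # Ordinary character, including an esc that escapes nothing.
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


s = "foo;bar:;baz::;one:two;::three::::;four;;five:::;six;:seven;::eight"
print(split_escaped(s))
```

Because the scanner consumes escape pairs from left to right, a run of colons before a semicolon escapes it only when the run has odd length, which is the same condition the (?<!:)(?:::)* prefix expresses in the regex version.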