Python Regex - Split string after every character
I have a string that follows the pattern of one or more digits followed by a single letter, 'a', 'b' or 'c'. I want to split the string after every letter.
some_function('12a44b65c')
>>> ['12a', '44b', '65c']
So far I've tried:
re.split('([abc]\d+)', '12a44b65c')
>>> ['12', 'a44', '', 'b65', 'c']
Your regex is backwards: it should be one or more digits followed by an a, b or c. Additionally, I wouldn't use split, which returns annoying empty strings, but findall:
>>> re.findall(r'(\d+[abc])', '12a44b65c')
['12a', '44b', '65c']
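The findall approach can be wrapped in a small function shaped like the some_function the question asks for. This is only a sketch: the names split_after_letters and TOKEN are made up for illustration, and the capture group is dropped because findall returns whole matches when the pattern has no groups.

```python
import re

# Hypothetical names; the pattern is the one from the answer,
# without the capture group, which findall does not need here.
TOKEN = re.compile(r'\d+[abc]')

def split_after_letters(text):
    """Return each run of digits together with the letter that ends it."""
    return TOKEN.findall(text)

print(split_after_letters('12a44b65c'))  # ['12a', '44b', '65c']
```

Note that findall silently skips anything that does not match, so trailing digits with no letter after them (as in '12a44') are dropped rather than reported.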
If you're able to use the newer regex module (a third-party package, installed separately from the standard library), you can even split on zero-width matches, that is, with lookarounds.
import regex as re
rx = r'(?V1)(?<=[a-z])(?=\d)'
string = "12a44b65c"
parts = re.split(rx, string)
print(parts)
# ['12a', '44b', '65c']
This approach looks for one of a-z immediately behind and a digit (\d) immediately ahead.
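To see where those zero-width matches land, the positions can be listed with the standard library's re.finditer, which does report empty matches. A small sketch; the variable names are illustrative and not from the original answer:

```python
import re

text = '12a44b65c'
# Same lookarounds as above: a lowercase letter behind, a digit ahead.
boundary = re.compile(r'(?<=[a-z])(?=\d)')

# Each match is empty, so its start() marks a cut point.
cut_points = [m.start() for m in boundary.finditer(text)]
print(cut_points)  # [3, 6]

# Slicing at those points reproduces the split by hand.
edges = [0] + cut_points + [len(text)]
pieces = [text[i:j] for i, j in zip(edges, edges[1:])]
print(pieces)  # ['12a', '44b', '65c']
```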
The original re.split() does not allow zero-width matches (this was the case in the standard library before Python 3.7), so for compatibility you explicitly need to turn the new behaviour on with (?V1) in the pattern.
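For readers on Python 3.7 or later, the same lookaround split can be tried with the standard library alone, since re.split() there accepts patterns that can match an empty string. A minimal sketch under that assumption; the names boundary and parts are illustrative:

```python
import re

# Assumes Python 3.7+; older versions do not split on empty matches.
# Cut wherever one of a, b, c is behind and a digit is ahead.
boundary = re.compile(r'(?<=[abc])(?=\d)')

parts = boundary.split('12a44b65c')
print(parts)  # ['12a', '44b', '65c']
```

No empty strings appear in the result because the pattern consumes no characters and never matches at the very start or end of this input.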
See a demo on regex101.com.