
Python: return matching and non-matching parts of a string

I would like to split a string into a list containing the parts that match a regexp pattern and the parts that do not.

For example:

import re
string = 'my_file_10'
pattern = r'\d+$'
# I know the matching part can be obtained with:
m = re.search(pattern, string).group()
print(m)
# 10
# The final result should be:
# ['my_file_', '10']

Put parentheses around the pattern to make it a capturing group, then use re.split() to produce a list of matching and non-matching elements:

pattern = r'(\d+$)'
re.split(pattern, string)

Demo:

>>> import re
>>> string = 'my_file_10'
>>> pattern = r'(\d+$)'
>>> re.split(pattern, string)
['my_file_', '10', '']

Because you are splitting on digits at the end of the string, re.split() also returns the (empty) text that follows the match, so an empty string is included at the end of the list.
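To see where the trailing empty string comes from, here is a small sketch. The input with two digit runs and the pattern without the $ anchor are assumptions chosen for illustration, not part of the question. With a single capturing group, re.split() alternates non-matching text and matched text, so the list always begins and ends with a non-matching piece, even when that piece is empty.

```python
import re

# Hypothetical input with two digit runs; not from the original question.
text = 'my_file_015_01'

# With one capturing group the result alternates:
# non-match, match, non-match, match, ..., non-match
parts = re.split(r'(\d+)', text)
print(parts)  # ['my_file_', '015', '_', '01', '']

# Even indexes hold the non-matching text, odd indexes hold the matches.
non_matching = parts[0::2]
matching = parts[1::2]
print(non_matching)  # ['my_file_', '_', '']
print(matching)      # ['015', '01']
```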

If you only ever expect one match, at the end of the string (which the $ in your pattern enforces here), just use the match.start() method to obtain an index for slicing the input string:

pattern = r'\d+$'
match = re.search(pattern, string)
not_matched, matched = string[:match.start()], match.group()

This returns:

>>> pattern = r'\d+$'
>>> match = re.search(pattern, string)
>>> string[:match.start()], match.group()
('my_file_', '10')
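One thing the slicing approach does not cover is an input with no trailing digits: re.search() then returns None, and calling .start() on it raises AttributeError. A minimal sketch of a guard follows; split_trailing_match is a hypothetical helper name, and returning the whole string as a single element when nothing matches is an assumption about the desired behaviour.

```python
import re

def split_trailing_match(pattern, text):
    """Return [non_matching_prefix, matched_text], or [text] if nothing matches.

    Hypothetical helper; assumes the pattern is anchored with $ so that
    the match, if any, ends the string.
    """
    match = re.search(pattern, text)
    if match is None:
        # re.search() returns None when the pattern does not match.
        return [text]
    return [text[:match.start()], match.group()]

print(split_trailing_match(r'\d+$', 'my_file_10'))  # ['my_file_', '10']
print(split_trailing_match(r'\d+$', 'my_file'))     # ['my_file']
```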

You can use re.split() to make a list of the separate pieces and then use filter(), which with None as its first argument removes all elements that are considered false (such as empty strings):

>>> import re
>>> list(filter(None, re.split(r'(\d+$)', 'my_file_015_01')))
['my_file_015_', '01']
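In Python 3, filter() returns a lazy iterator rather than a list, so a list comprehension is a common alternative when a list is needed. The following is a sketch of that variant, not part of the original answer:

```python
import re

# Keep only the non-empty pieces produced by re.split().
parts = [part for part in re.split(r'(\d+$)', 'my_file_015_01') if part]
print(parts)  # ['my_file_015_', '01']
```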
