Regex to find the greedy and lazy matches and all matches in between

Question

I have a sequence such as '01 02 09 02 09 02 03 05 09 08 09 ', and I want to find a sequence that starts with 01 and ends with 09, with one to nine two-digit numbers in between, such as 02, 03, 04, etc. This is what I have tried so far.

I'm using \w{2}\s (\w{2} to match the two digits, and \s for the whitespace). This can occur one to nine times, which leads to (\w{2}\s){1,9}. The whole regex becomes (01\s(\w{2}\s){1,9}09\s). This returns the following result:

<regex.Match object; span=(0, 33), match='01 02 09 02 09 02 03 05 09 08 09 '>

If I use the lazy quantifier ?, it returns the following result:

<regex.Match object; span=(0, 9), match='01 02 09 '>
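
A minimal sketch reproducing the two results above, assuming Python's standard-library re module rather than the third-party regex module that the printed output suggests. The variable names are illustrative only.

```python
import re

s = "01 02 09 02 09 02 03 05 09 08 09 "

# Greedy: {1,9} takes as many two-character chunks as it can.
greedy = re.search(r"(01\s(\w{2}\s){1,9}09\s)", s)
# Lazy: {1,9}? takes as few two-character chunks as it can.
lazy = re.search(r"(01\s(\w{2}\s){1,9}?09\s)", s)

print(greedy.span(), repr(greedy.group()))
print(lazy.span(), repr(lazy.group()))
```

Either way, a single search call reports only one match for a given starting position, which is why the intermediate results do not show up.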

How can I obtain the results in between as well? The desired result would include all of the following:

<regex.Match object; span=(0, 9), match='01 02 09 '>
<regex.Match object; span=(0, 15), match='01 02 09 02 09 '>
<regex.Match object; span=(0, 27), match='01 02 09 02 09 02 03 05 09 '>
<regex.Match object; span=(0, 33), match='01 02 09 02 09 02 03 05 09 08 09 '>

Answer 1

You can extract these strings using

import re
s = "01 02 09 02 09 02 03 05 09 08 09 "
m = re.search(r'01(?:\s\w{2})+\s09', s)
if m:
    print( [x[::-1] for x in re.findall(r'(?=\b(90.*?10$))', m.group()[::-1])] )
# => ['01 02 09 02 09 02 03 05 09 08 09', '01 02 09 02 09 02 03 05 09', '01 02 09 02 09', '01 02 09']

See the Python demo.

With the 01(?:\s\w{2})+\s09 pattern and re.search, you can extract the substring from 01 to the last 09 (with one or more space-separated chunks of two word characters in between).
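
The first step can be tried in isolation. This is a small sketch using the same pattern and input as the answer; the name longest is only illustrative.

```python
import re

s = "01 02 09 02 09 02 03 05 09 08 09 "

# The greedy + grabs every chunk, then backtracks just far enough
# for the final \s09 to match the last 09 in the string.
m = re.search(r"01(?:\s\w{2})+\s09", s)
longest = m.group() if m else None

print(repr(longest))
```

Note that this pattern does not consume the trailing space and does not enforce the one-to-nine limit from the question; it simply runs to the last 09.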

The second step, [x[::-1] for x in re.findall(r'(?=\b(90.*?10$))', m.group()[::-1])], reverses both the string and the pattern to get all overlapping matches from 09 to 01, then reverses each match to get the final strings.
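
The second step may be easier to follow when unrolled. The sketch below splits the one-liner into named stages (the names are illustrative) and starts from the substring that the first step yields for the sample input.

```python
import re

# Substring from 01 to the last 09, as produced by the first step.
matched = "01 02 09 02 09 02 03 05 09 08 09"

# Reversing turns "ends with 09" into "starts with 90".
reversed_text = matched[::-1]

# A capture group inside a lookahead consumes no characters, so
# findall can report a match at every position where 90 starts,
# even though those matches overlap. The $ pins each match to the
# end of the reversed string, i.e. the original leading 01.
reversed_hits = re.findall(r"(?=\b(90.*?10$))", reversed_text)

# Reverse each hit back into its original reading order.
results = [hit[::-1] for hit in reversed_hits]

for r in results:
    print(repr(r))
```

Because the reversed string is scanned left to right, the longest result comes first and the shortest last.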

You may also reverse the final list by adding [::-1] at the end of the list comprehension: print( [x[::-1] for x in re.findall(r'(?=\b(90.*?10$))', m.group()[::-1])][::-1] ).
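
For comparison, here is a different approach that is not part of this answer: try each repetition count from 1 to 9 with a fixed-count quantifier and keep the counts that match. It is only a sketch, assuming the question's original pattern and the standard-library re module, but it does keep the one-to-nine limit and the trailing space.

```python
import re

s = "01 02 09 02 09 02 03 05 09 08 09 "

matches = []
for n in range(1, 10):
    # Exactly n two-character chunks between the leading 01 and a 09.
    pattern = r"01\s(?:\w{2}\s){%d}09\s" % n
    m = re.match(pattern, s)  # re.match anchors at the start of s
    if m:
        matches.append(m)

for m in matches:
    print(m.span(), repr(m.group()))
```

With the sample input, this yields the four spans listed in the question, shortest first.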

Answer 2

Here is a non-regex answer that post-processes the matching elements:

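# Split the input into whitespace-separated tokens
s = '01 02 09 02 09 02 03 05 09 08 09 '.strip().split()
# Validate: starts with '01', ends with '09', 1 to 9 tokens in between,
# and every token is a two-digit number with a leading zero
assert s[0] == '01'        \
   and s[-1] == '09'       \
   and (3 <= len(s) <= 11) \
   and len(s) == len([elem for elem in s if len(elem) == 2 and elem.isdigit() and elem[0] == '0'])
# Collect every prefix that ends at an occurrence of '09'
[s[:i+1] for i in sorted({s.index('09', i) for i in range(2,len(s))})]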
# [
#    ['01', '02', '09'], 
#    ['01', '02', '09', '02', '09'], 
#    ['01', '02', '09', '02', '09', '02', '03', '05', '09'],
#    ['01', '02', '09', '02', '09', '02', '03', '05', '09', '08', '09']
# ]
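
The same idea can be wrapped in a reusable function. This is a sketch only: the function name and its parameters are hypothetical, and it uses enumerate instead of the set of s.index calls above.

```python
def prefixes_ending_with(text, start="01", end="09", max_between=9):
    """Return every token prefix that starts with `start`, ends with
    `end`, and has between 1 and `max_between` tokens in between."""
    tokens = text.split()
    if not tokens or tokens[0] != start:
        return []
    return [
        tokens[: i + 1]
        for i, token in enumerate(tokens)
        # i - 1 is the number of tokens between the first one and token i
        if token == end and 1 <= i - 1 <= max_between
    ]


result = prefixes_ending_with("01 02 09 02 09 02 03 05 09 08 09 ")
for prefix in result:
    print(prefix)
```

Unlike the assert-based version, this returns an empty list for input that does not start with the expected token instead of raising an error.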


Answer 1
0 Accepted 2022-03-01 16:00:48

Answer 2
0 2022-03-01 16:36:09
