
Regex to match a capturing group one or more times

I'm trying to match pairs of digits in a string and capture them in groups; however, I seem to be able to capture only the last group.

Regex:
(\d\d){1,3}

Input String: 123456 789101

Match 1: 123456
Group 1: 56

Match 2: 789101
Group 1: 01

What I want is to capture all the groups like this:

Match 1: 123456
Group 1: 12
Group 2: 34
Group 3: 56

Update:
It looks like Python's re module keeps only the last capture of a repeated group, whereas in .NET, for example, you can get all the captures in a single pass. So in Python, re.findall(r'\d\d', '123456') does the job.

You cannot do that using just a single regular expression. It is a special case of counting, which you cannot do with a regex pattern alone. Tried at every position, \d\d would give you overlapping pairs:

Group1: 12 Group2: 23 Group3: 34 ...

The regex library in Python comes with a non-overlapping routine, namely re.findall(), that does the trick, as in:

     re.findall(r'\d\d', '123456')

will return ['12', '34', '56']
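
To make the overlapping versus non-overlapping distinction concrete, here is a small sketch. The lookahead pattern is not part of the answer above; it is just one common way to get overlapping pairs, shown only for contrast.

```python
import re

text = '123456'

# re.findall scans left to right and resumes after each match,
# so the pairs never overlap.
non_overlapping = re.findall(r'\d\d', text)

# A capturing group inside a zero-width lookahead consumes nothing,
# so a match is attempted at every position and the pairs overlap.
overlapping = re.findall(r'(?=(\d\d))', text)

print(non_overlapping)  # ['12', '34', '56']
print(overlapping)      # ['12', '23', '34', '45', '56']
```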

(\d{2})+(\d)?

I'm not sure how Python handles its matches, but this is what I would do.
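
For what it is worth, here is a quick check of how Python's re module treats that pattern. The repeated group still reports only its last repetition, and the optional second group picks up a trailing odd digit. The sample inputs are made up for illustration.

```python
import re

pattern = re.compile(r'(\d{2})+(\d)?')

even = pattern.fullmatch('123456')
odd = pattern.fullmatch('12345')

# Group 1 holds only the last pair matched by the repeated group;
# group 2 is None when there is no leftover digit.
print(even.groups())  # ('56', None)
print(odd.groups())   # ('34', '5')
```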

Try this:

import re
re.findall(r'\d\d','123456')
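
When the input holds several numbers, as in the question, one option is a two-step approach: find each run with the question's pattern (made non-capturing here), then split that run into pairs with re.findall. This is only a sketch, and the helper name pairs_per_match is hypothetical.

```python
import re

def pairs_per_match(text):
    """Return (match_text, pairs) for every run of one to three digit pairs."""
    results = []
    for m in re.finditer(r'(?:\d\d){1,3}', text):
        # Split the matched run into consecutive, non-overlapping pairs.
        results.append((m.group(0), re.findall(r'\d\d', m.group(0))))
    return results

for whole, pairs in pairs_per_match('123456 789101'):
    print(whole, pairs)
# 123456 ['12', '34', '56']
# 789101 ['78', '91', '01']
```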

Is this what you want?

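import re

# A token must be preceded by a space, CR, LF, or the start of the string,
# consist of one to three pairs of digits, and be followed by a space,
# CR, LF, or the end of the string.
regx = re.compile(r'(?:(?<= )|(?<=\A)|(?<=\r)|(?<=\n))'
                  r'(\d\d)(\d\d)?(\d\d)?'
                  r'(?= |\Z|\r|\n)')

for s in ('   112233  58975  6677  981  897899\r',
          '\n123456 4433 789101 41586 56 21365899 362547\n',
          '0101 456899 1 7895'):
    # Python 3 print(); unmatched optional groups come back as ''.
    print(repr(s), '\n', regx.findall(s), '\n')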

Result:

'   112233  58975  6677  981  897899\r' 
[('11', '22', '33'), ('66', '77', ''), ('89', '78', '99')] 

'\n123456 4433 789101 41586 56 21365899 362547\n' 
[('12', '34', '56'), ('44', '33', ''), ('78', '91', '01'), ('56', '', ''), ('36', '25', '47')] 

'0101 456899 1 7895' 
[('01', '01', ''), ('45', '68', '99'), ('78', '95', '')] 
