
Regular expression group matching

I am trying to match a sequence of binary digits separated by whitespace, like this:

>>> import re
>>> seq = '0 1 1 1 0 0 1 0'

So I wrote this regex:

>>> pat = r'(\b[01]\b)+'

But the following search returns only one digit:

>>> re.search(pat, seq).group(0)
'0'

What's wrong?

You're very close; the pattern is just missing a space. Try pat = r'\b([01] )*[01]\b':

>>> import re
>>> seq = '0 1 1 1 0 0 1 0'
>>> pat = r'\b([01] )*[01]\b'
>>> re.search(pat, seq).group(0)
'0 1 1 1 0 0 1 0'
>>> re.search(pat, 'spam and 0 0 0 1 0eggs').group(0)
'0 0 0 1'
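
To illustrate the difference, here is a minimal sketch (the variable names `broken`, `fixed`, `last_group`, and `digits` are only illustrative, not from the original post). It also shows one thing to watch for: a repeated capturing group keeps only its last repetition, so group(1) is not the whole sequence. Use group(0), and split it if you need the individual digits.

```python
import re

seq = '0 1 1 1 0 0 1 0'

# Original pattern: nothing in it can consume the space between digits,
# so the repetition stops after the first digit.
broken = re.search(r'(\b[01]\b)+', seq).group(0)

# Corrected pattern: zero or more "digit + space" pairs, then a final digit.
m = re.search(r'\b([01] )*[01]\b', seq)
fixed = m.group(0)

# A repeated capturing group only remembers its last repetition.
last_group = m.group(1)

# To get the individual digits, split the full match on whitespace.
digits = fixed.split()

print(repr(broken))      # '0'
print(repr(fixed))       # '0 1 1 1 0 0 1 0'
print(repr(last_group))  # '1 '
print(digits)            # ['0', '1', '1', '1', '0', '0', '1', '0']
```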

Your current regex has no way to match the whitespace between the digits, so each match can only be a single character. You can either use the same regex with re.findall() to get all matches in the string, or modify the regex so that it keeps matching when it encounters whitespace.

Here is an example using re.findall():

>>> re.findall(r'(\b[01]\b)+', '0 1 1 1 0 0 1 0')
['0', '1', '1', '1', '0', '0', '1', '0']

Or, by changing the regex to (\b[01]\b\s?)+, you can get the entire sequence in a single match:

>>> re.search(r'(\b[01]\b\s?)+', '0 1 1 1 0 0 1 0').group(0)
'0 1 1 1 0 0 1 0'
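
A side-by-side sketch of the two approaches on a messier input may help when choosing between them. The input string and the variable names (`text`, `singles`, `loose`, `cleaned`) are only illustrative. One caveat: because \s? sits inside the repeated group, the single-match version can pick up a trailing whitespace character when one follows the last standalone digit, so you may want to strip the result.

```python
import re

text = 'spam and 0 0 0 1 0eggs'

# findall() returns every separate match; the '0' glued to 'eggs' is
# skipped because there is no word boundary between '0' and 'e'.
singles = re.findall(r'(\b[01]\b)+', text)

# The single-match version consumes an optional whitespace character
# after each digit, so the match can end with a trailing space.
loose = re.search(r'(\b[01]\b\s?)+', text).group(0)

# Strip it if only the digits and the separators between them are wanted.
cleaned = loose.strip()

print(singles)        # ['0', '0', '0', '1']
print(repr(loose))    # '0 0 0 1 '
print(repr(cleaned))  # '0 0 0 1'
```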


 