
Why is this regular expression not working ({m, n})?

I am trying to understand regular expressions and am on the repetitions part: {m, n}.

I have this code:

>>> import re
>>> p = re.compile('a{1}b{1, 3}')
>>> p.match('ab')    # prints nothing: match() returned None
>>> p.match('abbb')  # prints nothing: match() returned None

As you can see, neither string matches the pattern. Why is this happening?

There should be no space after the comma, and the {1} is redundant.

Try

p = re.compile('a{1}b{1,3}')

...and mind the space.

Remove the extra whitespace in the quantifier after b.

Change:

p = re.compile('a{1}b{1, 3}')

to:

p = re.compile('a{1}b{1,3}')
                        ^   # no whitespace

and all should be well.
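A minimal sketch of the difference, assuming a CPython 3 interpreter; the names `fixed` and `spaced` are only for illustration. Without the space the braces are read as a quantifier, so b may repeat one to three times; with the space they are not.

```python
import re

# Quantifier written without a space: 'b' may repeat one to three times.
fixed = re.compile(r'a{1}b{1,3}')

# Same pattern with the space: the braces are not read as a quantifier.
spaced = re.compile(r'a{1}b{1, 3}')

for text in ('ab', 'abbb', 'abbbb'):
    m = fixed.match(text)
    # match() anchors at the start and consumes at most three 'b' characters.
    print(text, '->', m.group(0) if m else None)

print(spaced.match('abbb'))  # None: there is no literal '{' after the 'b'
```

Note that `fixed.match('abbbb')` still succeeds, but only the first three b characters are part of the match.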

You are seeing some re behaviour that is very "dark corner", nigh on a bug (or two).

# Python 2.7.1
>>> import re
>>> pat = r"b{1, 3}\Z"
>>> bool(re.match(pat, "bb"))
False
>>> bool(re.match(pat, "b{1, 3}"))
True
>>> bool(re.match(pat, "bb", re.VERBOSE))
False
>>> bool(re.match(pat, "b{1, 3}", re.VERBOSE))
False
>>> bool(re.match(pat, "b{1,3}", re.VERBOSE))
True
>>>

In other words, the pattern "b{1, 3}" matches the literal text "b{1, 3}" in normal mode, and the literal text "b{1,3}" in VERBOSE mode.

The "Law of Least Astonishment" would suggest either (1) the space in front of the 3 is ignored and the pattern matches "b", "bb", or "bbb" as appropriate [preferable], or (2) an exception is raised at compile time.

Looking at it another way, there are two possibilities: (a) the person who writes "{1, 3}" is imbued with the spirit of PEP8 and believes it is prescriptive and applies everywhere; (b) the person who writes that has tested the undocumented behaviour of re, actually wants to match the literal text "b{1, 3}", and perversely wants to use r"b{1, 3}" instead of explicitly escaping: r"b\{1, 3}". It seems to me that (a) is much more probable than (b), and re should act accordingly.

Yet another perspective: when the space is reached, the parser has already consumed {, a string of digits, and a comma, i.e. it is well into the {m,n} "operator". To silently ignore an unexpected character and treat it as though it were literal text is mind-boggling, perlish, etc.
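Option (2) above, an error at compile time, can be approximated in user code. The following is only a sketch under stated assumptions: `strict_compile` and `_BRACE_GROUP` are hypothetical names, not part of the re module, and the check is deliberately naive (it does not understand escaped braces or character classes, so it can raise on patterns that are actually fine).

```python
import re

# Hypothetical helper, not part of the re module. Finds brace groups that
# look like {m}, {m,n}, {m,} or {,n}, optionally with whitespace inside.
_BRACE_GROUP = re.compile(r'\{(\s*\d*\s*,?\s*\d*\s*)\}')

def strict_compile(pattern, flags=0):
    """Compile pattern, but raise ValueError if a quantifier contains spaces."""
    for found in _BRACE_GROUP.finditer(pattern):
        inner = found.group(1)
        has_digit = any(ch.isdigit() for ch in inner)
        has_space = any(ch.isspace() for ch in inner)
        if has_digit and has_space:
            raise ValueError(
                'whitespace inside quantifier %r at position %d'
                % (found.group(0), found.start())
            )
    return re.compile(pattern, flags)

p = strict_compile('a{1}b{1,3}')      # compiles normally
try:
    strict_compile('a{1}b{1, 3}')     # rejected instead of silently literal
except ValueError as exc:
    print(exc)
```

Someone who really does want the literal text would have to call `re.compile` directly, which matches the view that case (b) is the rare one.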

Update: Bug report lodged.

Do not insert spaces between { and }.

p = re.compile('a{1}b{1,3}')

You can compile the regex with the VERBOSE flag, which means most whitespace in the regex is ignored. I think this is a very good practice for describing complex regular expressions in a more readable manner.
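As a rough illustration of that style (a sketch, with `pattern` as an arbitrary name), the regex can be spread over several lines with comments. The quantifier is still written without a space: as the Python 2.7.1 session in the earlier answer shows, VERBOSE did not make "{1, 3}" behave as a quantifier there.

```python
import re

# VERBOSE lets the pattern span several lines with comments; whitespace
# outside character classes is ignored unless it is escaped.
pattern = re.compile(r"""
    a{1}      # exactly one 'a' (the {1} is redundant but harmless)
    b{1,3}    # one to three 'b' characters; no space inside the braces
""", re.VERBOSE)

print(pattern.match('abbb').group(0))
```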

See here for details...

Hope this helps...
