Python re module - repeated symbols limit when using 'or'

I have a question. Let's say we have the following pattern:

>>> import re
>>> pt ='^a{1,2}$'
>>> re.search(pt, 'aa') # looks ok
<_sre.SRE_Match object at 0x020B2288>
>>> re.search(pt, 'aaa') # ok too
>>>

Now let's try to find matches with another pattern:

>>> pt = '^a{1,2}|x$'
>>> re.search(pt, 'a') # this one looks ok
<_sre.SRE_Match object at 0x020B25D0>
>>> re.search(pt, 'aaax') # (1) Now this one?
<_sre.SRE_Match object at 0x020B2288>
>>> re.search(pt, 'aaaaaax') # (2) and this one?
<_sre.SRE_Match object at 0x020B25D0>
>>> re.search(pt, 'aaa') # (3) and this one?
<_sre.SRE_Match object at 0x020B25D0>

(1)(2)(3) To me it looks like the pattern should match a string that starts with one or two 'a' characters, or a single 'x', or a combination of both, and ends right there, with nothing else. Or am I missing something? Is it supposed to behave like that? Is it that when you use '|', the limit inside {} is ignored? Can someone explain this to me?

The $ is affected by the grouping. Your regex is interpreted as (^a{1,2})|(x$), which matches either "one or two a's at the beginning of the string" OR "an x at the end of the string". In cases (1) to (3) the first alternative succeeds on the leading a's, and nothing in that alternative requires the match to reach the end of the string. If you want the | to apply only to the a's and x's, you need to group them:

pt = '^(a{1,2}|x)$'

Or, if you don't want to capture the group, use a non-capturing group:

pt = '^(?:a{1,2}|x)$'
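To see the difference between the two forms, it can help to print the substring each pattern actually matched. This is only a rough sketch; the helper name `show` and the list of test strings are made up for illustration:

```python
import re

def show(pattern, text):
    """Return the substring matched by re.search, or None if nothing matched."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

ungrouped = '^a{1,2}|x$'      # parsed as (^a{1,2}) | (x$)
grouped = '^(?:a{1,2}|x)$'    # both anchors apply to the whole alternation

for s in ['a', 'aaa', 'aaax', 'aaaaaax', 'x']:
    # Print the input, then what each pattern matched.
    print(repr(s), show(ungrouped, s), show(grouped, s))
```

With the ungrouped pattern, 'aaa', 'aaax' and 'aaaaaax' all match on just the leading 'aa'. The grouped pattern rejects those three strings and accepts only 'a' and 'x' from this list.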

Edit: I'm not sure I understand what you're trying to match, but perhaps try:

pt = '^(a{1,2}x?|x)$'
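A quick way to check whether that last pattern fits your intent is to run it over a few candidate strings. The candidates below are invented for illustration and are not taken from the question:

```python
import re

# One or two a's optionally followed by an x, or a lone x, and nothing else.
pt = '^(a{1,2}x?|x)$'

candidates = ['a', 'aa', 'ax', 'aax', 'x', 'aaa', 'aaax', 'xx', 'xa', '']
accepted = [s for s in candidates if re.search(pt, s)]
print(accepted)
```

Only 'a', 'aa', 'ax', 'aax' and 'x' are accepted. Three or more a's, a repeated x, or an x before an a are rejected.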

^foo|bar$ matches either ^foo or bar$.
