Python re module - repeated symbols limit when using 'or'
I have a question. Let's say we have a pattern like this:
>>> import re
>>> pt ='^a{1,2}$'
>>> re.search(pt, 'aa') # looks ok
<_sre.SRE_Match object at 0x020B2288>
>>> re.search(pt, 'aaa') # ok too
>>>
Now let's try to find a match with another pattern:
>>> pt = '^a{1,2}|x$'
>>> re.search(pt, 'a') # this one looks ok
<_sre.SRE_Match object at 0x020B25D0>
>>> re.search(pt, 'aaax') # (1) Now this one?
<_sre.SRE_Match object at 0x020B2288>
>>> re.search(pt, 'aaaaaax') # (2) and this one?
<_sre.SRE_Match object at 0x020B25D0>
>>> re.search(pt, 'aaa') # (3) and this one?
<_sre.SRE_Match object at 0x020B25D0>
(1)(2)(3) To me, it looks like this pattern should match a string that starts with one or two 'a's, or one 'x', or a combination of both, and ends there, with nothing else. Or am I missing something? Is it supposed to behave like that? Is it that when you use '|', the limit inside {} is ignored? Can someone explain this to me?
The $ is affected by the grouping. Your regex is interpreted as (^a{1,2})|(x$), which matches either "one or two a's at the beginning of the string" OR "an x at the end of the string". The {1,2} limit is not ignored: for 'aaa', the first alternative matches the leading 'aa', and nothing in that alternative requires the string to end there.
If you want the | to apply only to the a's and x's, you need to group them:
pt = '^(a{1,2}|x)$'
Or, if you don't want to capture the group, use a noncapturing group:
pt = '^(?:a{1,2}|x)$'
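As a rough check, the sketch below runs the strings from the question against both the original pattern and the grouped one. The names `ungrouped`, `grouped`, and the helper `matches` are only illustrative and not part of the original post.

```python
import re

ungrouped = r'^a{1,2}|x$'     # parsed as (^a{1,2}) | (x$)
grouped = r'^(?:a{1,2}|x)$'   # both anchors apply to the whole alternation

def matches(pattern, text):
    """Return True if re.search finds the pattern anywhere in text."""
    return re.search(pattern, text) is not None

for text in ['a', 'aa', 'x', 'aaa', 'aaax', 'aaaaaax']:
    print(text, matches(ungrouped, text), matches(grouped, text))

# The ungrouped pattern "matches" 'aaa' only because it matches the first two characters.
print(re.search(ungrouped, 'aaa').group())  # → aa
```

With the grouped pattern, only 'a', 'aa', and 'x' are accepted from this list, while the ungrouped pattern finds a match in every one of them.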
Edit: I'm not sure I understand what you're trying to match, but perhaps try:
pt = '^(a{1,2}x?|x)$'
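To get a feel for what this last pattern accepts, here is a small sketch that filters a handful of candidate strings. The candidate list is an assumption chosen for illustration; whether this is the behaviour the asker wants is not clear from the question.

```python
import re

# One or two 'a's with an optional trailing 'x', or a lone 'x'.
pt = r'^(a{1,2}x?|x)$'

candidates = ['a', 'aa', 'ax', 'aax', 'x', 'aaa', 'aaax', 'aaaaaax']
accepted = [s for s in candidates if re.search(pt, s)]
print(accepted)  # → ['a', 'aa', 'ax', 'aax', 'x']
```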
^foo|bar$ matches ^foo or bar$.
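A quick way to convince yourself of this, using made-up test strings:

```python
import re

pattern = r'^foo|bar$'

# Matches via the first alternative: 'foo' at the start, whatever follows.
print(re.search(pattern, 'foobaz').group())   # → foo
# Matches via the second alternative: 'bar' at the end, whatever precedes.
print(re.search(pattern, 'bazbar').group())   # → bar
# Neither alternative applies: 'foo' is not at the start, 'bar' is not at the end.
print(re.search(pattern, 'xfoobarx'))         # → None
```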