Python, regular expressions, named groups and the "logical or" operator

In Python regular expressions, named and unnamed groups are both defined with '(' and ')'. This leads to some weird behavior. The regexp

"(?P<a>1)=(?P<b>2)"

used with the text "1=2" will find a named group "a" with value "1" and a named group "b" with value "2". But if I want to use the "logical or" operator to concatenate multiple rules, the following regexp:

"((?P<a>1)=(?P<b>2))|(?P<c>3)"

used with the same text "1=2" will find an unnamed group with value "1=2". I understand that the regexp engine treats the "(" and ")" that enclose groups "a" and "b" as an unnamed group and reports that it was found. But I don't want unnamed groups to be reported; I just want to use "|" to "glue" multiple regexps together, without creating any parasitic unnamed groups. Is there a way to do this in Python?

Use (?:...) to get rid of the unnamed group:

r"(?:(?P<a>1)=(?P<b>2))|(?P<c>3)"
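A quick way to check the difference is to compare what the two patterns report for the same text. This is a minimal sketch using only the standard re module; the variable names are mine, chosen for illustration:

```python
import re

text = "1=2"

# Original pattern: the outer parentheses form an unnamed capturing group.
capturing = re.compile(r"((?P<a>1)=(?P<b>2))|(?P<c>3)")
# Same pattern with the outer parentheses made non-capturing.
non_capturing = re.compile(r"(?:(?P<a>1)=(?P<b>2))|(?P<c>3)")

m1 = capturing.search(text)
m2 = non_capturing.search(text)

print(m1.groups())     # ('1=2', '1', '2', None)
print(m2.groups())     # ('1', '2', None)
print(m2.groupdict())  # {'a': '1', 'b': '2', 'c': None}
```

Note that group "c" still appears with the value None, because it belongs to the alternative that did not match.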

From the documentation of re:

(?:...) A non-grouping version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.

By the way, the alternation operator | has very low precedence, which makes parentheses unnecessary in cases like yours. You can drop the extra parentheses in your regex and it will continue to work as expected:

r"(?P<a>1)=(?P<b>2)|(?P<c>3)"
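To see that the parenthesis-free version behaves the same, here is a small sketch that tries it against both alternatives. The test strings "1=2" and "3" and the variable names are assumptions made for the example:

```python
import re

# No outer parentheses: | splits the whole pattern into two alternatives.
pattern = re.compile(r"(?P<a>1)=(?P<b>2)|(?P<c>3)")

left = pattern.search("1=2")   # matches the first alternative
right = pattern.search("3")    # matches the second alternative

print(left.groupdict())   # {'a': '1', 'b': '2', 'c': None}
print(right.groupdict())  # {'a': None, 'b': None, 'c': '3'}
print(pattern.groups)     # 3
```

The pattern has exactly three capturing groups, all named, so no unnamed group is reported.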
