
Regex multiple orders of named groups

I have a set of patterns that occur in multiple orders. I'd like to refer to each pattern by name in order to sort them and extract their information. The code below doesn't work: a named group may be defined only once per pattern, and placing the same group in more than one alternative of the | operator is treated as a redefinition.

import re

a = r'(?P<A>AAA)'
b = r'(?P<B>BBB)'
c = r'(?P<C>CCC)'
d = r'(?P<D>DDD)'
x = r'(?P<X>XXX)'

cases = '|'.join([fr'{a}{b}{c}',
                  fr'{b}{c}{x}',
                  fr'{b}{a}{x}',
                  fr'{x}{d}{a}',
                  ...])

pattern = fr'({cases})'

# m is the match object; a separate name avoids confusion with the pattern string x
result = [(m.group('A'),
           m.group('B'),
           m.group('C'),
           m.group('D'),
           m.group('X'))
          for m in re.finditer(pattern, long_string)]

Is there a way to put a group with the same name in different parts of the | operator?
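
The redefinition problem can be reproduced in a much smaller form. The following is a minimal sketch, assuming Python's standard `re` module; the two-alternative pattern is made up for illustration and is not taken from the code above.

```python
import re

# Both alternatives define groups named A and B, so compilation fails
# before any matching takes place.
try:
    re.compile(r'(?P<A>AAA)(?P<B>BBB)|(?P<B>BBB)(?P<A>AAA)')
    error_message = None
except re.error as exc:
    error_message = str(exc)

print(error_message)
```

The error is raised by `re.compile` itself, so it does not depend on the contents of the string being searched.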

Once you have defined a named group, you refer to it again with a named backreference, which matches the same text that the group captured.

For example, to refer to the A group:

(?P=A)
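
As a small illustration of what a named backreference does in Python's `re` module, the sketch below uses a made-up pattern in which group A may capture either AAA or BBB. The backreference then requires the same text to appear again.

```python
import re

# (?P=A) matches whatever text group A captured, not the sub-pattern itself.
pattern = re.compile(r'(?P<A>AAA|BBB)-(?P=A)')

same = pattern.fullmatch('BBB-BBB')       # group A captured 'BBB', repeated after the dash
different = pattern.fullmatch('AAA-BBB')  # 'BBB' is not what group A captured

print(same.group('A'))
print(different)
```

Note that a backreference repeats previously captured text; it does not define the group a second time.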


See the documentation: https://www.regular-expressions.info/named.html?wlr=1
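
Because a backreference only repeats captured text, it may not cover the case where the same named pieces appear in different orders. One possible workaround, sketched here under assumptions, is to give each alternative its own uniquely suffixed group names and merge them after matching. The helpers `build_pattern` and `merged_groups`, the `parts` and `orders` tables, and the sample `long_string` are all hypothetical names introduced for this example.

```python
import re

# Hypothetical building blocks: group name -> sub-pattern.
parts = {'A': 'AAA', 'B': 'BBB', 'C': 'CCC', 'D': 'DDD', 'X': 'XXX'}

# The four orders listed in the question; each letter names one part.
orders = ['ABC', 'BCX', 'BAX', 'XDA']


def build_pattern(parts, orders):
    """Build one alternation where alternative i uses names like A_0, B_0."""
    alternatives = []
    for i, order in enumerate(orders):
        alternatives.append(
            ''.join(f'(?P<{name}_{i}>{parts[name]})' for name in order)
        )
    return re.compile('|'.join(alternatives))


def merged_groups(match):
    """Strip the numeric suffix and keep only groups that took part in the match."""
    merged = {}
    for key, value in match.groupdict().items():
        if value is not None:
            merged[key.rsplit('_', 1)[0]] = value
    return merged


pattern = build_pattern(parts, orders)
long_string = 'AAABBBCCC--XXXDDDAAA--BBBAAAXXX'  # made-up sample input
results = [merged_groups(m) for m in pattern.finditer(long_string)]

for row in results:
    print(row)
```

Each result is a dictionary keyed by the original names, so a missing part (for example D in the first match) is simply absent and can be read with `row.get('D')`.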
