
Return groups in Regular Expression matched by string

Suppose we have the following regular expression:


We can test with Python's re module if this matches the following string:


It does match. Then, using Python, how could we retrieve each part of the regular expression matched by each part of the string?

For example, the following output list would work:

[ '[ABC]{1,3}', '.{0,2}', '[DEFG]', '.{2,3}', 'V' ]

You can use named capturing groups; after obtaining a match, you can get the values mapped to those names with groupdict(). I also advise building such a pattern dynamically from an OrderedDict, so that each named part stays paired with its sub-pattern and the parts keep their order.

See a Python 2.7 demo:

import collections
import re

# Define the pattern parts; each key is also the name of its capturing group
parts = [('p1', r'(?P<p1>[ABC]{1,3})'),
    ('p2', r'(?P<p2>.{0,2})'),
    ('p3', r'(?P<p3>[DEFG])'),
    ('p4', r'(?P<p4>.{2,3})'),
    ('v', r'(?P<v>V)')]
# Create and init the OrderedDict so the parts keep their order
pod = collections.OrderedDict(parts)
# Build the pattern by concatenating the parts in order
reg = "".join(pod.values())
test_str = "AXDXXV"
# Find a match
m = re.search(reg, test_str)
if m:
    # If a match is found, get the groupdict() (group name -> matched text)
    m_dict = m.groupdict()
    print(m_dict)
    # Show each matched value next to the sub-pattern that produced it
    for name, pattern in pod.items():
        print("{} => {}".format(m.group(name), pattern))

The regex will look like (?P<p1>[ABC]{1,3})(?P<p2>.{0,2})(?P<p3>[DEFG])(?P<p4>.{2,3})(?P<v>V), and once a match is found, groupdict() returns something like {'p2': 'X', 'p3': 'D', 'p1': 'A', 'p4': 'XX', 'v': 'V'} (a plain dict in Python 2.7 does not preserve order). You can then pair any matched value with its underlying pattern using "{} => {}".format(m.group("p1"), pod["p1"]) (e.g. A => (?P<p1>[ABC]{1,3})).
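
If the group names are not needed, the same idea can be sketched with plain numbered groups, which returns the bare sub-patterns from the question alongside the text each one matched. This is only an illustrative variant, not part of the original answer: the names SUBPATTERNS and match_parts are hypothetical, and it assumes that no sub-pattern contains a capturing group of its own (that would shift the group numbering).

```python
import re

# Hypothetical name: the bare sub-patterns, in order.
SUBPATTERNS = [r'[ABC]{1,3}', r'.{0,2}', r'[DEFG]', r'.{2,3}', r'V']


def match_parts(subpatterns, text):
    """Return a list of (sub-pattern, matched text) pairs, or None if no match.

    Assumes no sub-pattern contains its own capturing group.
    """
    # Wrap every sub-pattern in its own numbered capturing group.
    combined = "".join("({})".format(p) for p in subpatterns)
    m = re.search(combined, text)
    if m is None:
        return None
    # m.groups() yields the captured texts in group order,
    # which is the same order as the sub-patterns.
    return list(zip(subpatterns, m.groups()))


pairs = match_parts(SUBPATTERNS, "AXDXXV")
for pattern, value in pairs:
    print("{} => {}".format(value, pattern))
```

Unlike the OrderedDict version, this keeps the sub-patterns free of (?P<name>...) wrappers, at the cost of addressing the parts by position rather than by name.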

