
Return groups in Regular Expression matched by string

Suppose we have the following regular expression:

[ABC]{1,3}.{0,2}[DEFG].{2,3}V

We can test with Python's re module if this matches the following string:

AXDXXV

It does match. Then, using Python, how could we retrieve each part of the regular expression matched by each part of the string?

For example, the following output list would work:

[ '[ABC]{1,3}', '.{0,2}', '[DEFG]', '.{2,3}', 'V' ]

You can use named capturing groups: after obtaining a match, you can get the values mapped to those names with groupdict(). I also advise building the pattern dynamically from an OrderedDict.

See a Python 2.7 demo:

import re
import collections

# Define the pattern parts; each key is the name of the capturing group in its value
parts = [('p1', r'(?P<p1>[ABC]{1,3})'),
         ('p2', r'(?P<p2>.{0,2})'),
         ('p3', r'(?P<p3>[DEFG])'),
         ('p4', r'(?P<p4>.{2,3})'),
         ('v', r'(?P<v>V)')]
# Create the OrderedDict so the parts keep their order
pod = collections.OrderedDict(parts)
# Build the full pattern by concatenating the parts in order
reg = "".join(pod.values())
test_str = "AXDXXV"
# Find a match
m = re.search(reg, test_str)
if m:
    # If a match is found, get the dict of group name -> matched text
    m_dict = m.groupdict()
    print(m_dict)
    print("{} => {}".format(m.group("p1"), pod["p1"]))

The regex will look like (?P<p1>[ABC]{1,3})(?P<p2>.{0,2})(?P<p3>[DEFG])(?P<p4>.{2,3})(?P<v>V), and once a match is found, you will get something like {'p2': 'X', 'p3': 'D', 'p1': 'A', 'p4': 'XX', 'v': 'V'}. Then you can always look up the sub-pattern that produced a value with "{} => {}".format(m.group("p1"), pod["p1"]) (e.g. A => (?P<p1>[ABC]{1,3})).
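If the goal is the list from the question, with each sub-pattern paired with the text it matched, the same idea can be sketched with plain (unnamed) groups in Python 3. The names SUBPATTERNS and match_parts are hypothetical. The sketch assumes that the sub-patterns contain no capturing groups of their own and that the whole string must match, so it uses re.fullmatch rather than re.search:

```python
import re

# Hypothetical input: the sub-patterns from the question, in order.
# Assumption: none of them contains a capturing group of its own.
SUBPATTERNS = ['[ABC]{1,3}', '.{0,2}', '[DEFG]', '.{2,3}', 'V']


def match_parts(subpatterns, text):
    """Return a list of (sub-pattern, matched text) pairs, or None if no match."""
    # Wrap every sub-pattern in its own capturing group and concatenate them.
    pattern = "".join("({})".format(p) for p in subpatterns)
    m = re.fullmatch(pattern, text)
    if m is None:
        return None
    # m.groups() is ordered like the groups, so it lines up with subpatterns.
    return list(zip(subpatterns, m.groups()))


pairs = match_parts(SUBPATTERNS, "AXDXXV")
print(pairs)
# [('[ABC]{1,3}', 'A'), ('.{0,2}', 'X'), ('[DEFG]', 'D'), ('.{2,3}', 'XX'), ('V', 'V')]
```

Positional groups avoid having to invent a name for every part, but they depend on the one-group-per-part assumption. If a sub-pattern may contain its own groups, the named-group version above is the safer choice.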
