
python match regular expression

I need to match a subject against a regex and map the captured groups to the keys of a corresponding key mask.

key_mask = 'foo/{one}/bar/{two}/hello/{world}'

regex_mask = 'foo/(.*)/bar/(.*)/hello/(.*)'

subject = 'foo/test/bar/something/hello/xxx'

The result should be:

{
"one": "test",
"two": "something",
"world": "xxx"
}

What is the best way to get this result from the 3 inputs?

(This is for simple URL routing, like Symfony's: http://symfony.com/doc/current/book/routing.html )

Thanks!

The simplest thing that comes to mind is to use named groups in the regular expression:

>>> import re
>>> regex_mask = 'foo/(?P<one>.*)/bar/(?P<two>.*)/hello/(?P<world>.*)'
>>> subject = 'foo/test/bar/something/hello/xxx'
>>> re.match(regex_mask, subject).groupdict()
{'world': 'xxx', 'two': 'something', 'one': 'test'}
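Note that re.match returns None when the subject does not fit the pattern, so calling groupdict() directly raises AttributeError in that case. A minimal sketch of a guarded lookup follows; the helper name extract is hypothetical and not part of the answer above:

```python
import re

regex_mask = 'foo/(?P<one>.*)/bar/(?P<two>.*)/hello/(?P<world>.*)'

def extract(subject):
    # re.match returns None if the pattern does not match at the start of subject
    match = re.match(regex_mask, subject)
    return match.groupdict() if match is not None else None

print(extract('foo/test/bar/something/hello/xxx'))
# A subject without the 'hello/' segment does not match, so this prints None
print(extract('foo/test/bar/something/xxx'))
```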

The simplest way would be to use named groups, i.e. instead of a plain (.*) use (?P<name>.*), and then call the groupdict() method of the match object.

However, if you cannot change the inputs to your problem (because you are getting them from another library, or for whatever other reason), you can automatically create a named-group regex from the key_mask using re.sub with a simple function as repl:

import re

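def to_named_group(match):
    # match.group(0) is '{name}'; strip the braces and wrap the name in a named group
    return '(?P<{}>.*)'.format(re.escape(match.group(0)[1:-1]))

def make_regex(key_mask):
    # Replace every '{name}' placeholder in the mask with '(?P<name>.*)'
    return re.compile(re.sub(r'\{[^}]+\}', to_named_group, key_mask))

def find_matches(key_mask, text):
    # Raises AttributeError if text does not match, because match() returns None
    return make_regex(key_mask).match(text).groupdict()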

Used as:

In [10]: find_matches('foo/{one}/bar/{two}/hello/{world}', 'foo/test/bar/something/hello/xxx')
Out[10]: {'one': 'test', 'two': 'something', 'world': 'xxx'}
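For URL routing, two details of the version above may matter: .* is greedy and can span several path segments, and literal parts of the mask (such as a dot) are treated as regex syntax. The following is only a sketch of a stricter variant under those assumptions, not the answer's method; the names PLACEHOLDER, make_strict_regex and find_strict_matches are hypothetical:

```python
import re

# Hypothetical helper names; this is a sketch, not the code from the answer above.
PLACEHOLDER = re.compile(r'\{([^}]+)\}')

def make_strict_regex(key_mask):
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(key_mask):
        # Escape the literal text between placeholders
        parts.append(re.escape(key_mask[pos:m.start()]))
        # Each placeholder matches exactly one path segment (no '/')
        parts.append('(?P<{}>[^/]+)'.format(m.group(1)))
        pos = m.end()
    parts.append(re.escape(key_mask[pos:]))
    return re.compile(''.join(parts))

def find_strict_matches(key_mask, text):
    # fullmatch requires the whole text to fit the mask
    match = make_strict_regex(key_mask).fullmatch(text)
    return match.groupdict() if match else None

print(find_strict_matches('foo/{one}/bar/{two}/hello/{world}',
                          'foo/test/bar/something/hello/xxx'))
```

With this variant, 'foo/a/b/bar' does not match the mask 'foo/{one}/bar', whereas the .* version would capture 'a/b' for one.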

Update based on your comment:

It is easy to pass to_named_groups additional information about the regexes to produce. For example, you could change the code to:

import re
from functools import partial

def to_named_groups(match, regexes):
    group_name = re.escape(match.group(0)[1:-1])
    group_regex = regexes.get(group_name, '.*')
    return '(?P<{}>{})'.format(group_name, group_regex)

def make_regex(key_mask, regexes):
    regex = re.sub(r'\{[^}]+\}', partial(to_named_groups, regexes=regexes),
                   key_mask)
    return re.compile(regex)

def find_matches(key_mask, text, regexes=None):
    if regexes is None:
        regexes = {}
    try:
        return make_regex(key_mask, regexes).search(text).groupdict()
    except AttributeError:
        return None

This way you can control what each named group is allowed to match.
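As an illustration of the regexes argument, here is a small usage sketch. The helpers are restated in condensed form so the snippet runs on its own, and the rules dictionary and sample paths are made up for the example:

```python
import re

# Condensed restatement of the helpers above so this snippet is self-contained.
def make_regex(key_mask, regexes):
    def to_named_group(match):
        name = match.group(1)
        # Fall back to '.*' when no specific pattern is given for this group
        return '(?P<{}>{})'.format(name, regexes.get(name, '.*'))
    return re.compile(re.sub(r'\{([^}]+)\}', to_named_group, key_mask))

def find_matches(key_mask, text, regexes=None):
    match = make_regex(key_mask, regexes or {}).search(text)
    return match.groupdict() if match else None

# Hypothetical per-group patterns: 'one' must be digits, 'two' keeps the default
rules = {'one': r'\d+'}
mask = 'foo/{one}/bar/{two}'

numeric = find_matches(mask, 'foo/42/bar/baz', rules)
rejected = find_matches(mask, 'foo/test/bar/baz', rules)
print(numeric)   # {'one': '42', 'two': 'baz'}
print(rejected)  # None
```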
