Regular expression to match pattern a or pattern b

I've got a small problem with Python's regular expression library, specifically with the match method when the pattern contains several alternatives:

import re
files = ["noi100k_0p55m0p3_fow71f",\
     "fnoi100v5_71f60s",\
     "noi100k_0p55m0p3_151f_560s",\
     "noi110v25_560s"]

for i in files:
    keyws = i.split("_")
    for j in keyws:
        if re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j): 
            print "Results :", re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j).group(1)

The results are:

Results : 100
Results : None
Results : 100
Results : None

When I would expect:

Results : 100
Results : 100
Results : 100
Results : 110

Only "noi(\w+)k" seems to match; the other patterns do not seem to be tested. But shouldn't re.match(a|b, string) check both the a and the b pattern?

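Your groups are numbered from left to right across the whole pattern, not per alternative. When one of the alternatives matches, only the groups inside that alternative are set and all other groups are None, so you need to extract the group that belongs to the alternative that matched.

You have 5 groups: either group 1, or groups 2 and 3, or groups 4 and 5 will contain a match: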

for j in keyws:
    match = re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j)
    if match: 
        results = match.group(1) or match.group(2) or match.group(4)
        print "Results :", results

would print the first matched (\w+) group of whichever alternative matched.
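To see the numbering directly, it can help to print every group for a few tokens. The following is a minimal sketch, not part of the original answer; the helper name group_layout is made up for illustration, and the print calls are written so that they behave the same on Python 2 and Python 3.

```python
import re

# The pattern from the question: five groups, numbered left to right.
PATTERN = re.compile(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)")

def group_layout(token):
    """Return all five groups for a token, or None if nothing matches.

    group_layout is a hypothetical helper used only for this illustration.
    """
    match = PATTERN.match(token)
    return match.groups() if match else None

for token in ["noi100k", "fnoi100v5", "noi110v25"]:
    # Groups that belong to an alternative that did not match are None.
    print("%s -> %r" % (token, group_layout(token)))
```

Only the groups of the alternative that matched are filled in, which is why group(1) is None for "fnoi100v5" and "noi110v25".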

Demo:

>>> import re
>>> files = ["noi100k_0p55m0p3_fow71f",\
...      "fnoi100v5_71f60s",\
...      "noi100k_0p55m0p3_151f_560s",\
...      "noi110v25_560s"]
>>> for i in files:
...     keyws = i.split("_")
...     for j in keyws:
...         match = re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j)
...         if match: 
...             results = match.group(1) or match.group(2) or match.group(4)
...             print "Results :", results
... 
Results : 100
Results : 100
Results : 100
Results : 110

If you are not going to use the other two captured (\w+) groups, remove their parentheses to make picking the matched group a little easier:

match = re.match(r"noi(\w+)k|fnoi(\w+)v\w+|noi(\w+)v\w+",j)
if match: 
    results = next(g for g in match.groups() if g)
    print "Results :", results

which picks the first group that actually matched; the groups of the alternatives that did not match are None and are skipped.
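As a possible variation (not from the original answer), the match object's lastindex attribute holds the index of the last capturing group that took part in the match. With exactly one group per alternative, as in the pattern above, that index identifies the group holding the value. The helper name first_number is an assumption made for this sketch.

```python
import re

# Same pattern as above: exactly one capturing group per alternative.
PATTERN = re.compile(r"noi(\w+)k|fnoi(\w+)v\w+|noi(\w+)v\w+")

def first_number(token):
    """Return the captured text of whichever alternative matched, or None.

    first_number is a hypothetical helper used only for this illustration.
    """
    match = PATTERN.match(token)
    if match is None:
        return None
    # lastindex is the index of the last group that participated in the
    # match; with one group per alternative it is the group we want.
    return match.group(match.lastindex)

for token in ["noi100k", "fnoi100v5", "noi110v25", "0p55m0p3"]:
    print("%s -> %s" % (token, first_number(token)))
```

This relies on the pattern having no nested or additional groups; if more groups are added to an alternative, lastindex no longer points at the first one.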

Your pattern could be simplified further if you accept fnoi(\w+)k as a possibility too:

match = re.match(r"f?noi(\w+)[kv](\w*)", j)

at which point the value you are after is always in .group(1).
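Applied to the file names from the question, the simplified pattern could be used roughly as follows. This is a sketch under the assumption that accepting the extra combinations (such as fnoi...k) is fine for your data; the function name extract_numbers is made up for the example.

```python
import re

# Simplified pattern: optional leading "f", then "noi", the value,
# a "k" or "v" separator, and an optional trailing part.
SIMPLE = re.compile(r"f?noi(\w+)[kv](\w*)")

files = ["noi100k_0p55m0p3_fow71f",
         "fnoi100v5_71f60s",
         "noi100k_0p55m0p3_151f_560s",
         "noi110v25_560s"]

def extract_numbers(names):
    """Collect group(1) for every underscore-separated token that matches.

    extract_numbers is a hypothetical helper used only for this illustration.
    """
    found = []
    for name in names:
        for token in name.split("_"):
            match = SIMPLE.match(token)
            if match:
                found.append(match.group(1))
    return found

for value in extract_numbers(files):
    print("Results : %s" % value)
```

Note that (\w+) is greedy, so if a token contained more than one "k" or "v", group(1) would extend up to the last one.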
