I've got a small problem with Python's regular expression library, specifically with the match method when the pattern contains several alternatives:
import re
files = ["noi100k_0p55m0p3_fow71f",\
         "fnoi100v5_71f60s",\
         "noi100k_0p55m0p3_151f_560s",\
         "noi110v25_560s"]
for i in files:
    keyws = i.split("_")
    for j in keyws:
        if re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j):
            print "Results :", re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j).group(1)
The results are:
Results : 100
Results : None
Results : 100
Results : None
When I would expect:
Results : 100
Results : 100
Results : 100
Results : 110
The only match is for "noi(\w+)k". It does not seem to test the other patterns, but re.match(a|b, string) should check both the a and the b pattern, no?
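All the alternatives are tested, but your groups are numbered from left to right across the whole pattern, not per alternative. Whichever alternative matches, it is the group belonging to that alternative that you need to extract; .group(1) is None whenever the first alternative is not the one that matched.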
You have 5 groups, either group 1, or groups 2 and 3, or groups 4 and 5 will contain a match:
for j in keyws:
    match = re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j)
    if match:
        results = match.group(1) or match.group(2) or match.group(4)
        print "Results :", results
This prints the first matched \w+ group of whichever alternative matched.
Demo:
>>> import re
>>> files = ["noi100k_0p55m0p3_fow71f",\
...          "fnoi100v5_71f60s",\
...          "noi100k_0p55m0p3_151f_560s",\
...          "noi110v25_560s"]
>>> for i in files:
...     keyws = i.split("_")
...     for j in keyws:
...         match = re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j)
...         if match:
...             results = match.group(1) or match.group(2) or match.group(4)
...             print "Results :", results
...
Results : 100
Results : 100
Results : 100
Results : 110
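To see why .group(1) came back as None in the question, it can help to look at all five groups at once. The following is only a small sketch: the helper name show_groups is made up for illustration, and it uses print() with a single argument so that it should run on both Python 2 and Python 3.

```python
import re

# Same pattern as in the question; the five groups are numbered left to right.
PATTERN = re.compile(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)")


def show_groups(keyword):
    """Return the tuple of all five groups, or None if the keyword does not match."""
    match = PATTERN.match(keyword)
    return match.groups() if match else None


for keyword in ["noi100k", "fnoi100v5", "noi110v25", "560s"]:
    # Groups of alternatives that did not take part in the match are None.
    print("%-10s %r" % (keyword, show_groups(keyword)))
```

For "fnoi100v5" only groups 2 and 3 are filled in, and for "noi110v25" only groups 4 and 5, which is why group 1 is None for those keywords.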
If you are not going to use the other two captured (\w+) groups, remove the parentheses to make picking the matched group a little easier:
match = re.match(r"noi(\w+)k|fnoi(\w+)v\w+|noi(\w+)v\w+",j)
if match:
    results = next(g for g in match.groups() if g)
    print "Results :", results
which picks the first matched group that is not empty.
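If you want to reuse this outside the loop, one possible way to package it is a small helper that also copes with keywords that do not match at all. This is a sketch under assumptions: the name first_group and its default parameter are invented here, not part of the re module.

```python
import re

# Three alternatives, one capturing group each.
PATTERN = re.compile(r"noi(\w+)k|fnoi(\w+)v\w+|noi(\w+)v\w+")


def first_group(keyword, default=None):
    """Return the first non-empty captured group, or `default` if nothing matched."""
    match = PATTERN.match(keyword)
    if match is None:
        return default
    # Passing a default to next() avoids StopIteration if every group is empty.
    return next((g for g in match.groups() if g), default)
```

Calling first_group("fnoi100v5") returns "100", and a keyword such as "560s" gives back the default instead of raising an exception.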
Your pattern could be further simplified if you accept fnoi(\w+)k as a possibility too:
match = re.match(r"f?noi(\w+)[kv](\w*)", j)
at which point the value you want is always in .group(1).
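As a quick check, the simplified pattern can be run over the file names from the question. This is just a sketch; the function name extract is made up for the example, and the results are collected in a list rather than printed so the same code works on Python 2 and 3.

```python
import re

# Optional leading "f", then "noi", the value, a "k" or "v", and an optional rest.
SIMPLE = re.compile(r"f?noi(\w+)[kv](\w*)")

files = ["noi100k_0p55m0p3_fow71f",
         "fnoi100v5_71f60s",
         "noi100k_0p55m0p3_151f_560s",
         "noi110v25_560s"]


def extract(names):
    """Collect group 1 for every underscore-separated keyword that matches."""
    results = []
    for name in names:
        for keyword in name.split("_"):
            match = SIMPLE.match(keyword)
            if match:
                results.append(match.group(1))
    return results


print(extract(files))
```

With the four file names above this gives the same values as the demo: 100, 100, 100 and 110.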