
Why doesn't this regex match the full patterns?

import re

patterns = re.compile(r'(yesterday|today) \d{1,2} hours \d{1,2} minutes')

matches = re.findall(patterns, 'yesterday 9 hours 32 minutes today 10 hours 30 minutes')

print(matches)

The code above prints:

['yesterday', 'today']

I expected:

['yesterday 9 hours 32 minutes', 'today 10 hours 30 minutes']

Why doesn't it match the full patterns?

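Your pattern wraps the choice between yesterday and today in a capture group:

(yesterday|today)

Grouping the alternation is a valid use of parentheses, but it has a side effect here: when a pattern contains exactly one capture group, re.findall returns only the text matched by that group, not the whole match. That is why you get ['yesterday', 'today'].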
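To see how the number of capture groups changes what re.findall returns, here is a minimal sketch. The shortened patterns and the variable names are only for illustration and are not part of the original question.

```python
import re

text = 'yesterday 9 hours 32 minutes today 10 hours 30 minutes'

# No capture group: findall returns the full text of each match.
no_group = re.findall(r'\d{1,2} hours', text)

# One capture group: findall returns only what that group matched.
one_group = re.findall(r'(yesterday|today) \d{1,2} hours', text)

# Two capture groups: findall returns one tuple of groups per match.
two_groups = re.findall(r'(yesterday|today) (\d{1,2}) hours', text)

print(no_group)    # ['9 hours', '10 hours']
print(one_group)   # ['yesterday', 'today']
print(two_groups)  # [('yesterday', '9'), ('today', '10')]
```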

You can handle this in several ways. One is to use finditer and call .group(0) on each match object, which always returns the full matched text:

import re

patterns = re.compile(r'(yesterday|today) \d{1,2} hours \d{1,2} minutes')

matches = patterns.finditer('yesterday 9 hours 32 minutes today 10 hours 30 minutes')
for match in matches:
    print(match.group(0))

Alternatively, make the group non-capturing:

import re

patterns = re.compile(r'(?:yesterday|today) \d{1,2} hours \d{1,2} minutes')

matches = patterns.findall('yesterday 9 hours 32 minutes today 10 hours 30 minutes')
print(matches)

The (?:...) syntax groups the alternation without capturing it. With no capture group in the pattern, findall returns the full text of each match.
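If you need both the full text and the captured day, one option is to keep the capture group and read both from each match object. This is a sketch under that assumption; the names text, full_matches and days are illustrative.

```python
import re

pattern = re.compile(r'(yesterday|today) \d{1,2} hours \d{1,2} minutes')
text = 'yesterday 9 hours 32 minutes today 10 hours 30 minutes'

# Materialize the iterator once so the matches can be read twice.
found = list(pattern.finditer(text))

# group(0) is the whole match; group(1) is the first capture group.
full_matches = [m.group(0) for m in found]
days = [m.group(1) for m in found]

print(full_matches)  # ['yesterday 9 hours 32 minutes', 'today 10 hours 30 minutes']
print(days)          # ['yesterday', 'today']
```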
