
Python 2.7 regex with format multiple list items

Imagine I want to find all time expressions referring to 'AM' and 'PM' in a string. Let's ignore for the moment that I could use '[AP]M' to do this (because I'm actually pulling the list of valid strings ['AM','PM'] from a dictionary whose keys are language codes). I'd like to look for both at once, like this:

import re

foo = ['am','pm']
separator = ':'
timex = re.compile('(1[012]|[1-9])%s([0-5][0-9])( %s)?' % (separator, foo), re.I)

bar = "It's 6:00 pm, do you know where your brain is?"

timex as written above doesn't get me what I'm after: it only matches up to the 'p' in 'pm'. (It seems to treat the characters of the list elements as a single character class, as though I had written [ampm].)
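To see why this happens, it helps to inspect the pattern string that the % operator actually builds. The following is a minimal sketch reusing the question's own variables; the names pattern and match are introduced here only for illustration:

```python
import re

foo = ['am', 'pm']
separator = ':'

# %-formatting a list inserts its repr(), brackets and quotes included,
# so the regex engine sees a character class rather than two alternatives.
pattern = '(1[012]|[1-9])%s([0-5][0-9])( %s)?' % (separator, foo)
# pattern == "(1[012]|[1-9]):([0-5][0-9])( ['am', 'pm'])?"

bar = "It's 6:00 pm, do you know where your brain is?"
match = re.search(pattern, bar, re.I)
# match.group(0) == '6:00 p'  (a space plus ONE character from the class)
```

The square brackets from the list's repr are read as a regex character class containing a, m, p, the quote, the comma and the space, which is why the match stops after a single letter.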

What I don't want is to do two passes over the string (one each for 'am' and 'pm').

Is there a nice Pythonic way to do a single pass for every item in foo?

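Join the list items with '|' so they become a regex alternation, then insert the result into the pattern. Here's the way I've inserted a list of arbitrary regex terms to be searched for: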

import re

foo = ['am','pm']
timex = re.compile('({foo})'.format(foo='|'.join(foo)))

bar = "It's 6:00 pm, do you know where your brain is?"

timex.findall(bar)

returns

['pm']

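You can capture more, such as the time itself. Note the doubled braces: str.format treats single braces as placeholders, so a literal regex quantifier like {1,2} has to be written as {{1,2}}: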

>>> timex = re.compile(r'(\d{{1,2}}:\d{{2}})\s*({foo})'.format(foo='|'.join(foo)))
>>> timex.findall(bar)
[('6:00', 'pm')]
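Putting the pieces together, here is one possible sketch that applies the '|'.join approach to the stricter hour and minute pattern from the question. The helper name build_timex and its parameters are hypothetical, and the use of re.escape is an added assumption (it is not in the original answer) meant to keep markers such as 'a.m.' from being read as regex syntax:

```python
import re

def build_timex(markers, separator=':'):
    """Compile a single-pass time regex for a list of AM/PM markers."""
    # Escape each marker so characters like '.' in 'a.m.' match literally.
    alternation = '|'.join(re.escape(m) for m in markers)
    # No literal braces in this pattern, so nothing needs doubling for format().
    pattern = r'(1[012]|[1-9]){sep}([0-5][0-9])(?: ?({alt}))?'.format(
        sep=re.escape(separator), alt=alternation)
    return re.compile(pattern, re.I)

timex = build_timex(['am', 'pm'])
bar = "It's 6:00 pm, do you know where your brain is?"
result = timex.findall(bar)
# result == [('6', '00', 'pm')]
```

The marker group stays optional, as in the question, so a bare time like 6:00 still matches with an empty third element. If one marker could be a prefix of another, sorting the list longest-first before joining is a reasonable precaution, since alternation tries the branches left to right.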


 