I have several strings and have identified a few date formats in them. I would like to recognize the date in each string:
an_2011_02_12_azar.mp3 -> this is yyyy_mm_dd
20121112_Marcel.mp3 -> this is yyyymmdd
cdani_270607.mp3 -> this is ddmmyy
lica_07_03_15.mp3 -> this is dd_mm_yy
To do so I have:
import datetime
import re

foo = """
an_2011_02_12_azar.mp3
20121112_Marcel.mp3
cdani_270607.mp3
lica_07_03_15.mp3
"""

for line in foo.splitlines():
    print(line)
    # deals with the 2011_02_12 format only
    match = re.search(r'\d{4}_\d{2}_\d{2}', line)
    if match:  # skip lines that contain no yyyy_mm_dd date
        date = datetime.datetime.strptime(match.group(), '%Y_%m_%d').date()
        print(date)
How can I apply several regular expressions so that all of these date formats are recognized?
If you remove the underscores:
datestr = line.replace('_', '')
then there would be only two date formats to deal with: yyyymmdd or ddmmyy. Furthermore, every date string would consist of either 6 or 8 digits, which you could find using the regex pattern r'\d{8}|\d{6}':
datestr = re.search(r'\d{8}|\d{6}', datestr).group()
The datestr could then be parsed with either
date = DT.datetime.strptime(datestr, '%d%m%y')
or
date = DT.datetime.strptime(datestr, '%Y%m%d')
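Since the two formats differ in length, one way to pick between them is to look at how many digits were matched. This is only a minimal sketch: the helper name parse_compact_date is made up for illustration, and it assumes datestr is exactly the 6- or 8-digit string found by the regex above.

```python
import datetime as DT


def parse_compact_date(datestr):
    """Parse a 6- or 8-digit string, choosing the format by its length.

    Hypothetical helper: 8 digits are read as yyyymmdd, 6 digits as ddmmyy.
    Raises ValueError if the digits do not form a valid date.
    """
    fmt = '%Y%m%d' if len(datestr) == 8 else '%d%m%y'
    return DT.datetime.strptime(datestr, fmt).date()


print(parse_compact_date('20121112'))
print(parse_compact_date('270607'))
```

Choosing by length means a six-digit string is never attempted with the four-digit-year format, at the cost of one extra branch.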
The pattern r'\d{8}|\d{6}' would also capture some non-date-like strings, such as digits which represent invalid dates. We could deal with those cases by using try..except to catch the resulting ValueError.
import re
import datetime as DT

foo = """\
an_2011_02_12_azar.mp3
20121112_Marcel.mp3
cdani_270607.mp3
lica_07_03_15.mp3
an_2011_13_12_azar.mp3
"""

for line in foo.splitlines():
    datestr = line.replace('_', '')
    datestr = re.search(r'\d{8}|\d{6}', datestr).group()
    try:
        # %y matches 2-digit years
        date = DT.datetime.strptime(datestr, '%d%m%y')
    except ValueError:
        try:
            # %Y matches 4-digit years
            date = DT.datetime.strptime(datestr, '%Y%m%d')
        except ValueError:
            # handle the error case
            date = None
    print('{:23} --> {}'.format(line, date))
yields
an_2011_02_12_azar.mp3  --> 2011-02-12 00:00:00
20121112_Marcel.mp3     --> 2012-11-12 00:00:00
cdani_270607.mp3        --> 2007-06-27 00:00:00
lica_07_03_15.mp3       --> 2015-03-07 00:00:00
an_2011_13_12_azar.mp3  --> None
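If you would rather keep the underscores and literally apply several regular expressions, as the question asks, one possible approach is to pair each regex with the strptime format it implies and try them in order. The following is a sketch under assumptions, not part of the solution above: the names PATTERNS and find_date are hypothetical, and the (?<!\d) and (?!\d) guards are an addition so that a pattern cannot match in the middle of a longer run of digits. Unlike the loop above, it returns None when a line contains no digits at all instead of raising AttributeError.

```python
import re
import datetime as DT

# Hypothetical table: each regex is paired with the strptime format it implies.
# The lookarounds keep a pattern from matching inside a longer digit run.
# Patterns with a 4-digit year are tried first.
PATTERNS = [
    (re.compile(r'(?<!\d)\d{4}_\d{2}_\d{2}(?!\d)'), '%Y_%m_%d'),  # yyyy_mm_dd
    (re.compile(r'(?<!\d)\d{8}(?!\d)'), '%Y%m%d'),                # yyyymmdd
    (re.compile(r'(?<!\d)\d{2}_\d{2}_\d{2}(?!\d)'), '%d_%m_%y'),  # dd_mm_yy
    (re.compile(r'(?<!\d)\d{6}(?!\d)'), '%d%m%y'),                # ddmmyy
]


def find_date(name):
    """Return the first valid date found in name, or None if nothing parses."""
    for pattern, fmt in PATTERNS:
        match = pattern.search(name)
        if match is None:
            continue  # this format does not occur in the string
        try:
            return DT.datetime.strptime(match.group(), fmt).date()
        except ValueError:
            continue  # the digits matched, but they are not a valid date
    return None


for name in ['an_2011_02_12_azar.mp3', 'lica_07_03_15.mp3', 'no_date.mp3']:
    print('{:23} --> {}'.format(name, find_date(name)))
```

Adding another format is then a matter of appending one more (regex, format) pair to the table.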