
Extract dates in different formats from a string in Python

I have several strings in which I have identified a few date formats, and I would like to recognize the date in each string:

an_2011_02_12_azar.mp3 ->this is yyyy_mm_dd
20121112_Marcel.mp3    ->this is yyyymmdd
cdani_270607.mp3       ->this is ddmmyy
lica_07_03_15.mp3      ->this is dd_mm_yy

To do so, I have:

import datetime
import re

foo = """
an_2011_02_12_azar.mp3
20121112_Marcel.mp3
cdani_270607.mp3
lica_07_03_15.mp3
"""
for line in foo.splitlines():
    print(line)
    # deals with the 2011_02_12 format only
    match = re.search(r'\d{4}_\d{2}_\d{2}', line)
    if match is None:
        continue  # no yyyy_mm_dd date on this line
    date = datetime.datetime.strptime(match.group(), '%Y_%m_%d').date()
    print(date)

How can I apply several regular expressions so that all of these dates are recognized?

If you remove the underscores:

datestr = line.replace('_', '')

then there would be only two date formats to deal with: yyyymmdd or ddmmyy. Furthermore, every date string would consist of either 8 or 6 digits, which you could find using the regex pattern r'\d{8}|\d{6}':

datestr = re.search(r'\d{8}|\d{6}', datestr).group()
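
To see what this step extracts from the sample filenames, here is a minimal sketch. The helper name extract_digits is hypothetical (it is not part of the answer), and it returns None instead of raising when no digit run is found:

```python
import re

names = [
    "an_2011_02_12_azar.mp3",
    "20121112_Marcel.mp3",
    "cdani_270607.mp3",
    "lica_07_03_15.mp3",
]

def extract_digits(name):
    """Return the first run of 8 (preferred) or 6 digits once underscores
    are removed, or None if the name contains no such run."""
    match = re.search(r'\d{8}|\d{6}', name.replace('_', ''))
    return match.group() if match else None

for name in names:
    print(name, '->', extract_digits(name))
```

Because \d{8} comes first in the alternation, an 8-digit run is matched in preference to its first 6 digits.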

The datestr could then be parsed with either

date = DT.datetime.strptime(datestr, '%d%m%y')

or

date = DT.datetime.strptime(datestr, '%Y%m%d')
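
Since the two remaining formats differ in length, one possible way to choose between the two calls is to look at the number of digits. This is only a sketch of that idea; parse_by_length is a hypothetical helper, not something from the answer itself:

```python
import datetime as DT

def parse_by_length(datestr):
    """Pick the strptime format from the digit count (assumption: 8 digits
    means yyyymmdd, 6 digits means ddmmyy). Returns a date or None."""
    fmt = {8: '%Y%m%d', 6: '%d%m%y'}.get(len(datestr))
    if fmt is None:
        return None
    try:
        return DT.datetime.strptime(datestr, fmt).date()
    except ValueError:
        # digits of the right length that do not form a valid date
        return None

print(parse_by_length('20110212'))
print(parse_by_length('270607'))
```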

The pattern r'\d{8}|\d{6}' would also capture some strings that are not dates, such as digits which represent invalid dates. We could deal with those cases by using try..except to catch ValueError.


import re
import datetime as DT

foo = """\
an_2011_02_12_azar.mp3
20121112_Marcel.mp3   
cdani_270607.mp3     
lica_07_03_15.mp3  
an_2011_13_12_azar.mp3
"""

for line in foo.splitlines():
    datestr = line.replace('_', '')
    datestr = re.search(r'\d{8}|\d{6}', datestr).group()
    try:
        # %y matches 2-digit years
        date = DT.datetime.strptime(datestr, '%d%m%y')
    except ValueError:
        try:
            # %Y matches 4-digit years
            date = DT.datetime.strptime(datestr, '%Y%m%d')
        except ValueError:
            # handle the error case
            date = None
    print('{:23} --> {}'.format(line, date))

yields

an_2011_02_12_azar.mp3  --> 2011-02-12 00:00:00
20121112_Marcel.mp3     --> 2012-11-12 00:00:00
cdani_270607.mp3        --> 2007-06-27 00:00:00
lica_07_03_15.mp3       --> 2015-03-07 00:00:00
an_2011_13_12_azar.mp3  --> None
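
If you would rather keep the underscores and literally apply several regular expressions, as the question asks, an alternative is to pair each pattern with its strptime format and try them in order. The sketch below is an assumption-laden variant, not the approach used above: PATTERNS and find_date are hypothetical names, and the lookarounds (?<!\d) and (?!\d) are added so that a short pattern cannot match inside a longer run of digits.

```python
import re
import datetime as DT

# (regex, strptime format) pairs, most specific first
PATTERNS = [
    (re.compile(r'(?<!\d)\d{4}_\d{2}_\d{2}(?!\d)'), '%Y_%m_%d'),
    (re.compile(r'(?<!\d)\d{8}(?!\d)'), '%Y%m%d'),
    (re.compile(r'(?<!\d)\d{2}_\d{2}_\d{2}(?!\d)'), '%d_%m_%y'),
    (re.compile(r'(?<!\d)\d{6}(?!\d)'), '%d%m%y'),
]

def find_date(name):
    """Return the first valid date found in name, or None."""
    for pattern, fmt in PATTERNS:
        match = pattern.search(name)
        if match is None:
            continue
        try:
            return DT.datetime.strptime(match.group(), fmt).date()
        except ValueError:
            continue  # looked like a date but is not a valid one
    return None

for name in ["an_2011_02_12_azar.mp3", "20121112_Marcel.mp3",
             "cdani_270607.mp3", "lica_07_03_15.mp3",
             "an_2011_13_12_azar.mp3"]:
    print('{:23} --> {}'.format(name, find_date(name)))
```

Unlike the code above, this returns date objects rather than datetime objects, and it returns None for names without any match instead of raising AttributeError.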
