
Regular expression in Python for finding dates

I am using regular expressions in Python to find dates like 09/2010 or 8/1976, but not 11/12/2010. I am using the following pattern, but it does not work in some cases.

r'([^/](0?[1-9]|1[012])/(\d{4}))'

import re

# Optional leading "d/" or "dd/", then "d/" or "dd/", then a 2- or 4-digit year
rgx = r"(?:\d{1,2}/)?\d{1,2}/\d{2}(?:\d{2})?"
dates = "09/2010, 8/1976, 11/12/2010, 09/06/15 .."

result = re.findall(rgx, dates)
print(result)
# ['09/2010', '8/1976', '11/12/2010', '09/06/15']
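
Note that the pattern above also returns 11/12/2010 and 09/06/15, which the question wants to exclude. A minimal sketch of a month/year-only match follows, assuming lookarounds are acceptable. The name MONTH_YEAR and the exact pattern are illustrative and not from the original post. A lookbehind is used instead of the question's [^/] because [^/] consumes a character and therefore cannot match at the very start of the string.

```python
import re

# Assumed pattern (not from the original post): a month from 1 to 12 with an
# optional leading zero, a slash, and a four-digit year.
# (?<![\d/]) rejects a candidate preceded by a digit or a slash.
# (?![\d/])  rejects a candidate followed by a digit or a slash.
MONTH_YEAR = re.compile(r"(?<![\d/])(?:0?[1-9]|1[0-2])/\d{4}(?![\d/])")

text = "09/2010, 8/1976, 11/12/2010, 09/06/15 .."
matches = MONTH_YEAR.findall(text)
print(matches)
# ['09/2010', '8/1976']
```

Because the lookarounds do not consume characters, the match itself contains only the date, so no extra capture group is needed to strip a leading character.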

This slightly more explicit code uses re.sub and datetime.strptime to parse and validate the input string:

import re
import datetime

s = '09/2010, 8/1976, 11/8/2010, 09/06/15, 12/1987, 13/2011, 09/13/2001'

r = re.compile(r'\b(\d{1,2})/(?:(\d{1,2})/)?(\d{2,4})\b')

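def validate_date(g, parsed_values):
    # Build a zero-padded day/month/year string; the day defaults to 01
    # when only month/year was matched.
    if g.group(2) is not None:
        s = '{:02d}/{:02d}/{:04d}'.format(*map(int, g.groups()))
    else:
        s = '01/{:02d}/{:04d}'.format(int(g.group(1)), int(g.group(3)))

    try:
        datetime.datetime.strptime(s, '%d/%m/%Y')
        parsed_values.append(g.group())
    except ValueError:
        # Not a valid calendar date (e.g. month 13): skip it
        pass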

parsed_values = []
r.sub(lambda g: validate_date(g, parsed_values), s)

print(parsed_values)

Prints:

['09/2010', '8/1976', '11/8/2010', '09/06/15', '12/1987']

EDIT: Shortened the code.
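
The same validation can also be written without relying on side effects inside re.sub. The sketch below is one possible variant, not the answerer's code: DATE_RE and parse_date are hypothetical names, the year is restricted to exactly two or four digits (an assumption), and two-digit years are parsed with %y so that 15 is read as 2015 rather than the year 15.

```python
import re
from datetime import datetime

# Assumed variant of the pattern above: the year must have 2 or 4 digits.
DATE_RE = re.compile(r"\b(\d{1,2})/(?:(\d{1,2})/)?(\d{2}|\d{4})\b")

def parse_date(match):
    """Return a datetime for a regex match, or None if the date is invalid."""
    first, second, year = match.groups()
    # Two-digit years use %y (15 -> 2015); four-digit years use %Y.
    year_fmt = "%y" if len(year) == 2 else "%Y"
    if second is None:
        # month/year only: assume the first day of the month
        text = f"01/{first}/{year}"
    else:
        # day/month/year, the same order the code above assumes
        text = f"{first}/{second}/{year}"
    try:
        return datetime.strptime(text, f"%d/%m/{year_fmt}")
    except ValueError:
        return None

s = "09/2010, 8/1976, 11/8/2010, 09/06/15, 12/1987, 13/2011, 09/13/2001"
valid = [m.group() for m in DATE_RE.finditer(s) if parse_date(m) is not None]
print(valid)
# ['09/2010', '8/1976', '11/8/2010', '09/06/15', '12/1987']
```

Returning the parsed datetime (or None) keeps the matching and the validation separate, so the parsed values can be reused instead of only the matched strings.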

After working on this problem, I arrived at the following solution, which works very well on my data:

df['text'].str.extractall(r'(?P<Date>(?P<month>\d{1,2})/?(?P<day>\d{1,2})?/(?P<year>\d{2,4}))')
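
To see what the named groups capture, the same pattern can be tried with the standard re module instead of pandas. This is only an illustrative sketch: the sample text is made up and the variable names pattern and rows are not from the original post.

```python
import re

# Same pattern as in the extractall call above
pattern = re.compile(
    r"(?P<Date>(?P<month>\d{1,2})/?(?P<day>\d{1,2})?/(?P<year>\d{2,4}))"
)

# Made-up sample text for illustration
text = "09/2010, 8/1976, 11/12/2010, 09/06/15"

# One dict per match, keyed by group name; unmatched groups are None
rows = [m.groupdict() for m in pattern.finditer(text)]
for row in rows:
    print(row)
```

For month/year inputs such as 09/2010 the day group is None. Since both the first slash and the day are optional, the pattern is fairly permissive: a string like 123/2010 also matches, with month 12 and day 3, so the captured values may still need validation.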
