
Regex to match a non numeric value or end of string in Python

I'm trying to match date formats with regex. An example date of each is:

02 Apr 15
02 Apr 2015

The regex I'm using to match the first one is:

re.compile("([0-9]{2}) ([A-Z][a-z]{2}) ([0-9]{2})")

And for the second:

re.compile("([0-9]{2}) ([A-Z][a-z]{2}) ([0-9]{4})")

The issue I'm having is that the second date also matches the first regex, even though its year has 4 digits rather than 2. I wanted to add an end-of-line anchor to the regex, but sometimes a time is appended to the date (e.g. 4:32). So what I want is for the first regex to match the corresponding date whether it is followed by nothing or by a space plus other text. The first one should match:

"02 Apr 15"
"02 Apr 15 5:23"

but not match:

"02 Apr 2015"
"02 Apr 2015 5:23"

The other regex should behave the opposite way. In short, the only values that matter are the first three fields (dd Mmm YY and dd Mmm YYYY).

What you're looking for is a word boundary (\b), i.e.:

re.compile(r"\b([0-9]{2}) ([A-Z][a-z]{2}) ([0-9]{2})\b")

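This makes sure a 4-digit year is not matched when you are trying to match the first date format in your examples: after the two year digits, \b requires either the end of the string or a non-word character such as a space, so "20" followed by "15" is rejected.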
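As a quick check, here is a minimal sketch that runs both patterns, each with word boundaries added, against the four sample strings from the question. The names `short_year`, `long_year`, and `samples` are made up for this illustration.

```python
import re

# Two-digit-year and four-digit-year patterns, both wrapped in \b.
short_year = re.compile(r"\b([0-9]{2}) ([A-Z][a-z]{2}) ([0-9]{2})\b")
long_year = re.compile(r"\b([0-9]{2}) ([A-Z][a-z]{2}) ([0-9]{4})\b")

samples = ["02 Apr 15", "02 Apr 15 5:23", "02 Apr 2015", "02 Apr 2015 5:23"]

for text in samples:
    # search() returns a match object or None; bool() turns that into True/False.
    print(f"{text!r}: short={bool(short_year.search(text))}, "
          f"long={bool(long_year.search(text))}")
```

Note that \b also treats letters and underscores as word characters, so a year immediately followed by a letter would not match either. If only a following digit should block the match, a negative lookahead such as `(?!\d)` in place of the trailing \b is one possible alternative.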

However, you should consider using Python's date-parsing routines (for example datetime.strptime) instead of a hand-written regex.
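One way this could look with the standard library is sketched below. The helper name `parse_date` and the choice to keep only the first three whitespace-separated fields are assumptions for this illustration, not part of the original answer. It also assumes English month abbreviations, since `%b` depends on the current locale.

```python
from datetime import datetime

def parse_date(text):
    """Hypothetical helper: parse 'dd Mmm YY' or 'dd Mmm YYYY', ignoring any trailing time."""
    # Keep only the day, month, and year fields; anything after them (e.g. "5:23") is dropped.
    date_part = " ".join(text.split()[:3])
    # %y accepts exactly two year digits, %Y expects four, so each input fits only one format.
    for fmt in ("%d %b %y", "%d %b %Y"):
        try:
            return datetime.strptime(date_part, fmt).date()
        except ValueError:
            continue
    return None  # neither format matched

for text in ["02 Apr 15", "02 Apr 15 5:23", "02 Apr 2015", "02 Apr 2015 5:23"]:
    print(text, "->", parse_date(text))
```

Unlike the regex, this also rejects impossible dates such as "31 Feb 15", because strptime validates the day against the month.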

