
Using python regular expression to match times

I'm trying to parse a CSV file with times in the form of 6:30pm, 7am, or midnight. I've googled around and read the Python docs on regular expressions, but I haven't been able to implement them successfully.

My first try to match them was:

re.findall(r'^d{1,2}(:d{1,2})?$', string)

But this didn't work. I have the parentheses and the question mark there because sometimes there is nothing more than the hour. Also, I haven't even begun to think about how to match the am and pm. Any help is appreciated!

First of all, to match digits you need \d, not just d.

re.findall(r'^\d{1,2}(:\d{1,2})?$', string)

Second, as written, your regex will only match a string that is exactly a single time and nothing else, because ^ means "beginning of string" and $ means "end of string". You can omit those if you want to find all of the times throughout the string:

re.findall(r'\d{1,2}(:\d{1,2})?', string)
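
To see what the anchors change, here is a small sketch. The sample strings are made up for illustration, and it uses re.search and re.finditer with group(0) so that the whole match is shown rather than the contents of the parenthesised group.

```python
import re

anchored = r'^\d{1,2}(:\d{1,2})?$'
unanchored = r'\d{1,2}(:\d{1,2})?'

# Hypothetical sample inputs, not taken from the original CSV file.
only_time = "6:30"
sentence = "open from 7 to 6:30"

# The anchored pattern matches only when the whole string is a time.
print(bool(re.search(anchored, only_time)))   # True
print(bool(re.search(anchored, sentence)))    # False

# Without the anchors, every time-like piece of the string is found.
found = [m.group(0) for m in re.finditer(unanchored, sentence)]
print(found)                                  # ['7', '6:30']
```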

As far as the am/pm goes, you can just add another optional group:

re.findall(r'\d{1,2}(:\d{1,2})?(am|pm)?', string)

Of course, because everything besides the first one or two digits is optional, you're also going to match any one- or two-digit number. You could instead require at least either am/pm or a colon and one or two more digits:

re.findall(r'\d{1,2}((am|pm)|(:\d{1,2})(am|pm)?)', string)

But findall behaves slightly oddly: if you have capturing groups in your pattern, it returns only the groups rather than the full match. Thus, you can change them to non-capturing groups:

re.findall(r'\d{1,2}(?:(?:am|pm)|(?::\d{1,2})(?:am|pm)?)', string)
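
The difference between the last two patterns can be checked with a short sketch. The sample text is an assumption made up for this example; the two patterns are the ones shown above.

```python
import re

text = "6:30pm, 7am, 12:05, 1989"  # hypothetical sample line

capturing = r'\d{1,2}((am|pm)|(:\d{1,2})(am|pm)?)'
non_capturing = r'\d{1,2}(?:(?:am|pm)|(?::\d{1,2})(?:am|pm)?)'

# With capturing groups, findall returns one tuple of groups per match,
# using '' for groups that did not take part in the match.
groups_only = re.findall(capturing, text)
print(groups_only)

# With non-capturing groups, findall returns the full matched text.
full_matches = re.findall(non_capturing, text)
print(full_matches)  # ['6:30pm', '7am', '12:05']
```

Note that 1989 is skipped in both cases, because a bare number with neither am/pm nor a colon no longer qualifies.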

If you are strictly looking for a regex solution, you can use:

re.findall(r'^\d{1,2}(:\d{1,2})?$', string)

But wait, that's not all. There is a better way to do it without regex ;). You can use Python's CSV parsing powers.

import csv

# A single CSV line; csv.reader accepts any iterable of lines.
string = "November,Monday,6:30pm,1989"
csv_reader = csv.reader([string])
for row in csv_reader:
    print(row)

Output

['November', 'Monday', '6:30pm', '1989']
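
The two approaches can also be combined: let csv split the fields, then keep only the fields that look like a time. This is only a sketch; the second input line is made up, and the pattern is the non-capturing one from the regex answer, applied with fullmatch so that a field such as 1989 is not mistaken for a time.

```python
import csv
import re

# Non-capturing time pattern; fullmatch requires the whole field to match.
TIME_RE = re.compile(r'\d{1,2}(?:(?:am|pm)|(?::\d{1,2})(?:am|pm)?)')

# The first line is the example above; the second is a hypothetical extra row.
lines = ["November,Monday,6:30pm,1989", "December,Friday,7am,1990"]

times = []
for row in csv.reader(lines):
    times.extend(field for field in row if TIME_RE.fullmatch(field))

print(times)  # ['6:30pm', '7am']
```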
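
import re
# Groups: 1 = hour, 2 = separator plus minutes, 3 = minutes, 4 = am/pm
regex = r'(\d{1,2})([.:](\d{1,2}))?[ ]?(am|pm)?'
value = "12:30pm"  # example input
groups = re.findall(regex, value)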

Group 1 gives the hour.
Group 3 gives the minutes.
Group 4 gives am/pm.

Examples:
12pm
12.30pm
12:30pm
2.30 am
All of these examples work.
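
As a rough sketch of how those groups might be used, the function below turns a matched string into an (hour, minute) pair on a 24-hour clock. The name to_24h is hypothetical, and the sketch assumes lowercase am/pm and does no range checking (for example, it would accept 99:99).

```python
import re

TIME_RE = re.compile(r'(\d{1,2})([.:](\d{1,2}))?[ ]?(am|pm)?')

def to_24h(value):
    """Return (hour, minute) on a 24-hour clock, or None if value is not a time."""
    match = TIME_RE.fullmatch(value.strip())
    if match is None:
        return None
    hour = int(match.group(1))
    minute = int(match.group(3) or 0)  # group 3 is None when minutes are absent
    suffix = match.group(4)
    if suffix == "pm" and hour != 12:
        hour += 12
    elif suffix == "am" and hour == 12:
        hour = 0
    return hour, minute

for sample in ["12pm", "12.30pm", "12:30pm", "2.30 am"]:
    print(sample, to_24h(sample))
```

Words such as midnight are not covered by this pattern and would need a separate check.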
