Regular expressions for pattern matching time format in python

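I would like to use regular expressions in Python to match the time formats below, and flag each line True or False depending on whether a match is found in it. Sample text is given below. How can this be done using only regular expressions?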

  • 2am-8pm
  • 2:00am - 8:00pm
  • 08:00 AM - 05:00 PM
  • 5:30am - 8:59pm

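The "_am - _pm" and "_am-_pm" pattern is consistent across every notation. Matching the colon and digit formats together with the optional spaces is what I have been trying to do. Below is a pattern I found: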

HH:MM 12-hour format, optional leading 0, mandatory meridiems (AM/PM)
/((1[0-2]|0?[1-9]):([0-5][0-9]) ?([AaPp][Mm]))/
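A quick check suggests why this pattern is not enough on its own: the minutes part is mandatory and it matches a single time rather than a range. The snippet below is only a sketch; the names hhmm, samples and found are made up for illustration, and the / delimiters are dropped because Python's re module does not use them.

```python
import re

# The HH:MM pattern quoted above, without the surrounding / delimiters.
hhmm = re.compile(r'((1[0-2]|0?[1-9]):([0-5][0-9]) ?([AaPp][Mm]))')

samples = ['2am-8pm', '2:00am - 8:00pm', '08:00 AM - 05:00 PM', '5:30am - 8:59pm']

# Collect every individual time the pattern finds in each sample.
found = {s: [m.group(0) for m in hhmm.finditer(s)] for s in samples}
for s, hits in found.items():
    print(s, hits, sep=' : ')
```

The first sample yields no hits at all because it has no colon, and the others yield two separate times each, so the range "time - time" still has to be expressed in the pattern itself.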

Sample text:

Lorem Ipsum is dummy text of the printing and typesetting industry between 2am-8pm. 
Contrary to popular belief, Lorem Ipsum is not simply random text.
Lorem has been the industry between 2:00am - 8:00pm standard dummy text since the 1500s. 
It has survived not only five centuries, but also between 08:00am-05:00pm 
It was popularised from 5:30am - 8:59pm with the release of Letraset sheets. 
More recently with desktop publishing software like Aldus PageMaker 983-765-0976. 

Desired output:

Lorem Ipsum is dummy text of the printing and typesetting industry between 2am-8pm. : True
Contrary to popular belief, Lorem Ipsum is not simply random text. : False
Lorem has been the industry between 2:00am - 8:00pm standard dummy text since the 1500s. : True
It has survived not only five centuries, but also between 08:00am-05:00pm : True
It was popularised from 5:30am - 8:59pm with the release of Letraset sheets. : True
More recently with desktop publishing software like Aldus PageMaker 983-765-0976. : False

You can use

(?i)(?<!\d)(?:1[0-2]|0?[1-9])(?::(?:[0-5][0-9]))?\s?[ap]m\s*-\s*(?:1[0-2]|0?[1-9])(?::(?:[0-5][0-9]))?\s?[ap]m\b

See the regex demo.

Details

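  • (?i) - case-insensitive mode
  • (?<!\d) - no digit is allowed immediately before the match
  • (?:1[0-2]|0?[1-9])(?::(?:[0-5][0-9]))? - the time pattern:
    • (?:1[0-2]|0?[1-9]) - 10, 11 or 12, or a digit from 1 to 9 with an optional leading 0
    • (?::(?:[0-5][0-9]))? - an optional minutes part (00 to 59) preceded by a : separator
  • \s? - an optional whitespace character
  • [ap]m - a or p, followed by m
  • \s*-\s* - a hyphen enclosed in zero or more whitespace characters
  • (?:1[0-2]|0?[1-9])(?::(?:[0-5][0-9]))?\s?[ap]m - the same time pattern as above
  • \b - a word boundary.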
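The breakdown above can also be written into the pattern itself with re.VERBOSE, which ignores whitespace and allows # comments inside the regex. This is a sketch of an equivalent layout, not a different pattern; the names TIME, RANGE and has_time_range are hypothetical.

```python
import re

# One clock time: hour, optional ":minutes", optional space, am/pm.
TIME = r'''
    (?:1[0-2]|0?[1-9])      # hour: 10-12, or 1-9 with an optional leading 0
    (?::[0-5][0-9])?        # optional minutes after a colon
    \s?                     # optional whitespace character
    [ap]m                   # a or p, then m
'''

# Two times joined by a hyphen; flags replace the inline (?i).
RANGE = re.compile(
    r'(?<!\d)' + TIME + r'\s*-\s*' + TIME + r'\b',
    re.IGNORECASE | re.VERBOSE,
)

def has_time_range(line):
    """Return True if the line contains something like '2am-8pm'."""
    return RANGE.search(line) is not None

print(has_time_range('industry between 2am-8pm.'))
print(has_time_range('Aldus PageMaker 983-765-0976.'))
```

Passing the flags to re.compile instead of using (?i) keeps the pattern string free of inline modifiers, which some readers find easier to maintain.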

Python demo

import re

# A single 12-hour clock time: hour, optional :minutes, optional space, am/pm
time = r'(?:1[0-2]|0?[1-9])(?::(?:[0-5][0-9]))?\s?[ap]m'
# Two such times joined by a hyphen, not preceded by a digit, ending at a word boundary
pattern = re.compile(r'(?i)(?<!\d){0}\s*-\s*{0}\b'.format(time))
texts = [
    'Lorem Ipsum is dummy text of the printing and typesetting industry between 2am-8pm.',
    'Contrary to popular belief, Lorem Ipsum is not simply random text.',
    'Lorem has been the industry between 2:00am - 8:00pm standard dummy text since the 1500s.',
    'It has survived not only five centuries, but also between 08:00am-05:00pm',
    'It was popularised from 5:30am - 8:59pm with the release of Letraset sheets.',
    'More recently with desktop publishing software like Aldus PageMaker 983-765-0976.',
]
for text in texts:
    # search() returns a match object or None; bool() turns that into True/False
    print(text, bool(pattern.search(text)), sep=" : ")

Output:

Lorem Ipsum is dummy text of the printing and typesetting industry between 2am-8pm. : True
Contrary to popular belief, Lorem Ipsum is not simply random text. : False
Lorem has been the industry between 2:00am - 8:00pm standard dummy text since the 1500s. : True
It has survived not only five centuries, but also between 08:00am-05:00pm : True
It was popularised from 5:30am - 8:59pm with the release of Letraset sheets. : True
More recently with desktop publishing software like Aldus PageMaker 983-765-0976. : False
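A few inputs outside the sample text help show what the lookbehind, the hour alternation and the word boundary each contribute. The strings in edge_cases are invented for illustration and are not part of the question; the pattern is the same one used in the demo above.

```python
import re

time = r'(?:1[0-2]|0?[1-9])(?::(?:[0-5][0-9]))?\s?[ap]m'
pattern = re.compile(r'(?i)(?<!\d){0}\s*-\s*{0}\b'.format(time))

edge_cases = [
    'Open 08:00 AM - 05:00 PM daily',  # uppercase with a space before AM/PM
    'Open 13am-2pm',                   # 13 is not a valid 12-hour clock hour
    'Open 2am-8pmx',                   # a letter glued directly after pm
    'Open 2am to 8pm',                 # "to" instead of a hyphen
]
results = [bool(pattern.search(t)) for t in edge_cases]
for t, r in zip(edge_cases, results):
    print(t, r, sep=' : ')
```

Only the first line is flagged True. The second fails because the 3 in 13 is preceded by a digit, the third because \b cannot match between m and x, and the fourth because the pattern requires a literal hyphen between the two times.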
