Python regular expression: find and output part of a pattern multiple times

Assume I have a string as follows:

2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00

Each date is followed by one or more times. Can a regular expression find all the times after each date, producing something like the following?

[('2021/12/23', '13:00','14:00'), ('2021/12/24', '13:00','14:00','15:00')]

I tried the following code in Python, but it returns only the last time for each date:

re.findall(r'(\d+/\d+/\d+)(\s\d+\:\d+)+','2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00')

>>>[('2021/12/23', ' 14:00'), ('2021/12/24', ' 15:00')]
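The result above follows from how the standard re module treats a capture group inside a quantifier: each repetition overwrites the previous capture, so only the final one survives. A minimal sketch illustrating this, with variable names chosen only for the example:

```python
import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# Group 2 is repeated by "+", so it is overwritten on every repetition.
m = re.search(r'(\d+/\d+/\d+)(\s\d+:\d+)+', s)

whole_match = m.group(0)  # the full matched text, every time included
last_time = m.group(2)    # only the last repetition of group 2

print(repr(whole_match))
print(repr(last_time))
```

The full match still contains every time, which is why the answers below work with the whole match instead of the repeated group.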

Use re.findall with a non-capturing group:

import re

inp = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'
matches = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})*', inp)
print(matches)

This prints:

['2021/12/23 13:00 14:00', '2021/12/24 13:00 14:00 15:00']

Explanation of regex:

\d{4}/\d{2}/\d{2}    match a date in YYYY/MM/DD format
(?: \d{1,2}:\d{2})*  match a space followed by a time in h:mm or hh:mm format, zero or more times
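Because the time group is quantified with *, a date with no times after it still matches. The sketch below compares * with + on a hypothetical input (not from the question) whose second date has no times:

```python
import re

# Hypothetical sample: the second date is not followed by any time.
sample = '2021/12/23 13:00 2021/12/25'

# "*" allows zero times, so the bare date is still returned.
star_matches = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})*', sample)

# "+" requires at least one time, so the bare date is skipped.
plus_matches = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})+', sample)

print(star_matches)
print(plus_matches)
```

Which quantifier to prefer depends on whether dates without times should appear in the result.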

You can use this findall + split solution:

import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s):
    print(i.split())

Output:

['2021/12/23', '13:00', '14:00']
['2021/12/24', '13:00', '14:00', '15:00']

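\d+/\d+/\d+(?:\s\d+\:\d+)+ matches a date followed by one or more times. The group is non-capturing ((?:...)), so findall returns the whole match rather than only the last repetition of the group.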

You could also use:

print([i.split() for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s)])

To get output:

[['2021/12/23', '13:00', '14:00'], ['2021/12/24', '13:00', '14:00', '15:00']]
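The question asked for a list of tuples rather than a list of lists. One possible way to get that shape, sketched here with the same pattern and an illustrative variable name result, is to wrap each split in tuple():

```python
import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# Split each whole match on whitespace, then freeze the pieces into a tuple.
result = [tuple(m.split()) for m in re.findall(r'\d+/\d+/\d+(?:\s\d+:\d+)+', s)]

print(result)
```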

With the third-party regex library from PyPI (pip install regex), repeated captures are retained, so the following works:

import regex
pattern = regex.compile(r'(?P<date>\d+/\d+/\d+)(?:\s+(?P<time>\d+:\d+))+')
for m in pattern.finditer('2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'):
    print(m.capturesdict())

Output:

{'date': ['2021/12/23'], 'time': ['13:00', '14:00']}
{'date': ['2021/12/24'], 'time': ['13:00', '14:00', '15:00']}

The PyPI regex library keeps every capture made by a repeated group instead of only the last one. Because the groups here are named, match.capturesdict() returns a dictionary that maps each group name to the list of all its captures.
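If installing a third-party package is not an option, a similar structure can be approximated with the standard re module by capturing the whole run of times as one group and splitting it afterwards. This is only a sketch: the group name times and the variable records are made up for the example, and unlike capturesdict() the date is stored as a plain string, not a one-element list.

```python
import re

s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# Capture the date, then the entire run of times as a single group.
pattern = re.compile(r'(?P<date>\d+/\d+/\d+)(?P<times>(?:\s+\d+:\d+)+)')

# Split the captured run on whitespace to recover the individual times.
records = [
    {'date': m.group('date'), 'time': m.group('times').split()}
    for m in pattern.finditer(s)
]

for record in records:
    print(record)
```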
