Regex pattern to match datetime in Python

Question

I have a string that contains datetimes, and I am trying to split the string at each datetime occurrence:

data="2018-03-14 06:08:18, he went on \n2018-03-15 06:08:18, lets play"

What I am doing:

out=re.split('^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$',data)

What I get:

["2018-03-14 06:08:18, he went on 2018-03-15 06:08:18, lets play"]

What I want:

["2018-03-14 06:08:18, he went on","2018-03-15 06:08:18, lets play"]

Answer 1

You want to split on at least one whitespace character that is followed by a date-like pattern, so you may use

re.split(r'\s+(?=\d{2}(?:\d{2})?-\d{1,2}-\d{1,2}\b)', s)

See the regex demo.

Details

\s+ - one or more whitespace characters
(?=\d{2}(?:\d{2})?-\d{1,2}-\d{1,2}\b) - a positive lookahead that makes sure that, immediately to the right of the current location, there are:
- \d{2}(?:\d{2})? - 2 or 4 digits
- - - a hyphen
- \d{1,2} - 1 or 2 digits
- -\d{1,2} - again a hyphen and 1 or 2 digits
- \b - a word boundary (if it is not necessary, remove it, or replace it with (?!\d) in case the dates may be glued to letters or other text)
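
The breakdown above can be written out as a commented pattern. This is a minimal sketch that uses re.VERBOSE so each piece carries its own comment; the name DATE_SPLIT is only illustrative, and the sample string is the one from the question.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE so that
# whitespace inside the pattern is ignored and '#' starts a comment.
DATE_SPLIT = re.compile(r"""
    \s+                  # one or more whitespace chars (consumed by the split)
    (?=                  # lookahead: the text to the right must look like a date
        \d{2}(?:\d{2})?  #   2 or 4 digits
        -\d{1,2}         #   a hyphen and 1 or 2 digits
        -\d{1,2}         #   again a hyphen and 1 or 2 digits
        \b               #   a word boundary
    )
""", re.VERBOSE)

data = "2018-03-14 06:08:18, he went on \n2018-03-15 06:08:18, lets play"
parts = DATE_SPLIT.split(data)
print(parts)
# => ['2018-03-14 06:08:18, he went on', '2018-03-15 06:08:18, lets play']
```

Because the date sits inside a lookahead, it is checked but not consumed, so it stays at the start of the next piece. The space between the date and the time does not trigger a split, since "06:08:18" contains no hyphens.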

Python demo:

import re
rex = r"\s+(?=\d{2}(?:\d{2})?-\d{1,2}-\d{1,2}\b)"
s = "2018-03-14 06:08:18, he went on 2018-03-15 06:08:18, lets play"
print(re.split(rex, s))
# => ['2018-03-14 06:08:18, he went on', '2018-03-15 06:08:18, lets play']

NOTE: If there might be no whitespace before the date, in Python 3.7 and newer you may use r"\s*(?=\d{2}(?:\d{2})?-\d{1,2}-\d{1,2}\b)" (note the * quantifier in \s*, which allows zero-length matches). For older versions, you will need to use a solution like the one @blhsing suggests, or install the PyPI regex module and use r"(?V1)\s*(?=\d{2}(?:\d{2})?-\d{1,2}-\d{1,2}\b)" with regex.split.
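
A hedged sketch of the zero-width variant on Python 3.7+. Once empty matches are allowed, the split also fires at the very start of the string, and the two-digit alternative appears able to match inside a four-digit year (for example just before "18-03-14"). The sketch therefore makes two assumptions that are not in the answer: years always have 4 digits, and a (?<!\d) guard is added so a split never lands in the middle of a digit run. The names ZERO_WIDTH_DATE and split_on_dates are hypothetical.

```python
import re

# Assumption: 4-digit years only. (?<!\d) rejects positions that are
# directly preceded by a digit, so a split cannot start mid-number.
ZERO_WIDTH_DATE = re.compile(r"\s*(?<!\d)(?=\d{4}-\d{1,2}-\d{1,2}\b)")

def split_on_dates(text):
    """Split text before each date and drop the empty pieces."""
    return [piece for piece in ZERO_WIDTH_DATE.split(text) if piece]

# The second date is glued to the preceding word, with no whitespace.
glued = "2018-03-14 06:08:18, he went on2018-03-15 06:08:18, lets play"
print(split_on_dates(glued))
# => ['2018-03-14 06:08:18, he went on', '2018-03-15 06:08:18, lets play']
```

Filtering out empty strings is a design choice here: it hides the empty piece produced by the match at position 0, at the cost of also dropping any genuinely empty segment.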

Answer 2

re.split is meant for cases where you have a certain delimiter pattern. Use re.findall with a lookahead pattern instead:

import re
data="2018-03-14 06:08:18, he went on \n2018-03-15 06:08:18, lets play"
d = r'\d{4}-\d?\d-\d?\d (?:2[0-3]|[01]?[0-9]):[0-5]?[0-9]:[0-5]?[0-9]'
print(re.findall(r'{0}.*?(?=\s*{0}|$)'.format(d), data, re.DOTALL))

This outputs:

['2018-03-14 06:08:18, he went on', '2018-03-15 06:08:18, lets play']
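
If the timestamp and the message are needed separately, the same lookahead idea can be combined with named groups. This is only a sketch built on the pattern above: the names STAMP, ENTRY, stamp and text are illustrative, and the optional comma after the timestamp is an assumption based on the sample data.

```python
import re

# Same datetime pattern as in the answer above.
STAMP = r"\d{4}-\d?\d-\d?\d (?:2[0-3]|[01]?[0-9]):[0-5]?[0-9]:[0-5]?[0-9]"

# stamp: the datetime itself; text: everything up to the next datetime
# (or the end of the string), found lazily and bounded by the lookahead.
ENTRY = re.compile(
    r"(?P<stamp>{0}),?\s*(?P<text>.*?)(?=\s*{0}|$)".format(STAMP),
    re.DOTALL,
)

data = "2018-03-14 06:08:18, he went on \n2018-03-15 06:08:18, lets play"
entries = [(m.group("stamp"), m.group("text")) for m in ENTRY.finditer(data)]
print(entries)
# => [('2018-03-14 06:08:18', 'he went on'), ('2018-03-15 06:08:18', 'lets play')]
```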

Answer 3

A similar but alternative solution, using a group instead:

import re

data="2018-03-14 06:08:18, he went on 2018-03-15 06:08:18, lets play"

print(re.findall(r'(.*?\D{2,})', data))

Which gives:

['2018-03-14 06:08:18, he went on ', '2018-03-15 06:08:18, lets play']
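
One caveat worth checking before relying on this pattern: it does not look for dates at all, it ends each piece at the first run of two or more non-digit characters. The sketch below uses a made-up input (the variable tricky is hypothetical) to show what happens when the message text itself contains a number.

```python
import re

# Hypothetical input: the first message contains the number 10.
tricky = "2018-03-14 06:08:18, he scored 10 points 2018-03-15 06:08:18, lets play"

# Same pattern as above, without the (redundant) capturing group.
pieces = re.findall(r".*?\D{2,}", tricky)
print(pieces)
# => ['2018-03-14 06:08:18, he scored ', '10 points ', '2018-03-15 06:08:18, lets play']
```

The first message is cut in two at the number, so this approach seems safe only when the text between dates contains no digits.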

3 answers

Answer 1: 3, accepted, 2018-07-18 07:30:20

Answer 2: 2, 2018-07-18 07:18:10

Answer 3: 1, 2018-07-18 07:27:16
