
Best regex to match IEEE Time Stamp format in Python

I did some searching but didn't find this specifically, and I'm sure it's going to be a quick answer.

I have a Python script parsing IEEE date and time stamps out of strings, but I think I'm using Python's match objects wrong.

import re
stir = "foo_2015-07-07-17-58-26.log"
timestamp = re.search("([0-9]+-){5}[0-9]+", stir).groups()
print(timestamp)

Produces

('58-',)

When my intent is to get

2015-07-07-17-58-26

Is there a pre-canned regex that would work better here? Am I getting tripped up on re's capture groups? Why is the length of the groups() tuple only 1?

Edit

I was misinterpreting the way capture groups work in Python's re module. There is only one set of parentheses in the pattern, so groups() returned a single entry, and a repeated group keeps only the last text it captured: the "58-".

The way I ended up doing it was by referencing group(0), as Dawg suggests below.

timestamp = re.search("([0-9]+-){5}[0-9]+", stir)

print(timestamp.group(0))
2015-07-07-17-58-26
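
To see why groups() came back with a single element, it may help to print each accessor side by side. This is a minimal sketch using the same string and pattern as the question; the variable names are only illustrative.

```python
import re

stir = "foo_2015-07-07-17-58-26.log"
match = re.search(r"([0-9]+-){5}[0-9]+", stir)

# group(0) is the entire match, regardless of any parentheses.
whole = match.group(0)

# group(1) is the single capture group; it repeats five times,
# and only the last repetition is kept.
last_repeat = match.group(1)

# groups() holds one entry per capture group in the pattern, so one here.
all_groups = match.groups()

print(whole)        # 2015-07-07-17-58-26
print(last_repeat)  # 58-
print(all_groups)   # ('58-',)
```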

You need a single capture group around the whole timestamp:

(\d\d\d\d-\d\d-\d\d-\d\d-\d\d-\d\d)


Or, nest a non-capturing group inside the capture group:

>>> re.search(r'(\d{4}(?:-\d{2}){5})', 'foo_2015-07-07-17-58-26.log')
<_sre.SRE_Match object at 0x100b49dc8>
>>> _.group(1)
'2015-07-07-17-58-26'

Or, you can use your pattern and just use group(0) instead of groups():

>>> re.search("([0-9]+-){5}[0-9]+", "foo_2015-07-07-17-58-26.log").group(0)
'2015-07-07-17-58-26'

Or, use findall with an outer capture group and make the repeated inner group non-capturing:

>>> re.findall("((?:[0-9]+-){5}[0-9]+)", 'foo_2015-07-07-17-58-26.log')
['2015-07-07-17-58-26']

But that will also match digit runs that are not part of a timestamp, because [0-9]+ accepts any number of digits.
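
As a rough illustration of that caveat, the sketch below compares the loose [0-9]+ pattern with the fixed-width \d{4}(?:-\d{2}){5} pattern shown earlier. The second file name is a made-up input, not something from the question.

```python
import re

# Only the first name comes from the question; the second is hypothetical.
names = ["foo_2015-07-07-17-58-26.log", "build-1-2-3-4-5-6.tmp"]

loose = re.compile(r"(?:[0-9]+-){5}[0-9]+")   # any digit counts
strict = re.compile(r"\d{4}(?:-\d{2}){5}")    # 4 digits, then five 2-digit fields

loose_hits = [loose.findall(n) for n in names]
strict_hits = [strict.findall(n) for n in names]

print(loose_hits)   # [['2015-07-07-17-58-26'], ['1-2-3-4-5-6']]
print(strict_hits)  # [['2015-07-07-17-58-26'], []]
```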

If you want the timestamp in one match object, I think this should work:

 \d{4}(?:-\d{2}){5} 

Then use group() or group(0).

Also, match.groups() returns a tuple of the captured subgroups, not the whole match, so you should use .group() instead. Because your one capture group is repeated five times, it keeps only its last repetition, "58-", and the final "26" lies outside the group.

I'd use the pattern below:

_(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})\.

The _ and . mark the start and the end of the timestamp.

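import re
# The dot is escaped so that it matches a literal "." only.
r = r'_(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})\.'
s = 'some string'
# re.findall takes the pattern first, then the string to search.
lst = re.findall(r, s)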
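
For a concrete run, here is a small sketch that applies this delimiter-based pattern to the file name from the question, with the dot escaped and the pattern passed as the first argument to re.findall. The names pattern and found are just illustrative.

```python
import re

# "_" must precede the timestamp and a literal "." must follow it.
pattern = r"_(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})\."
s = "foo_2015-07-07-17-58-26.log"

# With one capture group, findall returns the captured text only,
# so the surrounding "_" and "." are dropped.
found = re.findall(pattern, s)
print(found)  # ['2015-07-07-17-58-26']
```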


You might want

re.findall(r"([0-9-]+)", stir)


>>> import re
>>> stir = "foo_2015-07-07-17-58-26.log"
>>> re.findall(r"([0-9-]+)", stir)
['2015-07-07-17-58-26']
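
Note that [0-9-]+ accepts any run of digits and hyphens, so other parts of a name can match too. One possible way to filter the candidates, sketched here under the assumption that the fields are year, month, day, hour, minute and second, is to try datetime.strptime on each one. The file name and the helper parse_or_none are hypothetical.

```python
import re
from datetime import datetime

# Hypothetical file name with an extra "-2" before the timestamp.
name = "foo-2_2015-07-07-17-58-26.log"

candidates = re.findall(r"([0-9-]+)", name)
print(candidates)  # ['-2', '2015-07-07-17-58-26']

def parse_or_none(text):
    """Return a datetime if text fits the assumed layout, else None."""
    try:
        return datetime.strptime(text, "%Y-%m-%d-%H-%M-%S")
    except ValueError:
        return None

# Keep only the candidates that parse as a full timestamp.
parsed = [dt for dt in map(parse_or_none, candidates) if dt is not None]
print(parsed)  # [datetime.datetime(2015, 7, 7, 17, 58, 26)]
```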
