
Regex pattern does not work with 'or'

I need to parse DHCP log data that looks like this:

2013-11-15 09:42:02 localhost dhcpd: DHCPACK on 10.51.1.242 to 00:1e:8c:21:83:a0 (Hostname Unsuitable for Printing) via eth2

I wrote a regex pattern to capture all of the fields. It looks like this:

(?P<date>[\d{2,4}-]*[\d{2}:\s]*)\s(?P<host>\S+)\s*(?P<facility>\s*\S*:)\s*((?P<action>DHCP*\S*)\s*|(?P<mac>([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]){2})\s*|(?P<message>\S*)\s*|(\s*))*

After calling re.search(regex, text).groupdict(), I get this dict:

{u'facility': u'dhcpd:', u'host': u'localhost', u'date': u'2013-11-15 09:42:02', u'mac': u'00:1e:8c:21:83:a0', u'action': u'DHCPACK', u'message': u''}

As you can see, every item matches correctly except the message part, which is enclosed in parentheses. I have tried many variations to capture it. The pattern (?P<message>\((.*)\)) works fine on its own and returns {u'message': u'(Hostname Unsuitable for Printing)'}, but it does not match at all when I use it inside the full pattern.

I'm stuck on this and really need help.

I'm not sure why you're using so many | alternation operators. I stripped them out, used \s+ as the delimiter between fields and $ (end of string) as the delimiter for the message, and this works for me:

import re

# Sample DHCP log line from the question
text = r'2013-11-15 09:42:02 localhost dhcpd: DHCPACK on 10.51.1.242 to 00:1e:8c:21:83:a0 (Hostname Unsuitable for Printing) via eth2'
# Fields are separated by \s+; the message runs to the end of the string ($)
my_regexp = r'^(?P<date>[\d{2,4}-]*[\d{2}:\s]*)\s+(?P<host>\S+)\s+(?P<facility>\s*\S*):(\s+(?P<action>DHCP*\S*).+(?P<mac>([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]){2})\s+(?P<message>.*))*$'
print(re.search(my_regexp, text).groupdict())

Output (Python 3):

{'date': '2013-11-15 09:42:02', 'host': 'localhost', 'facility': 'dhcpd', 'action': 'DHCPACK', 'mac': '00:1e:8c:21:83:a0', 'message': '(Hostname Unsuitable for Printing) via eth2'}
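
A likely reason the original pattern returns an empty message: in Python's re module, a named group inside a repeated group keeps only the text from its last iteration, so the earlier matches are overwritten. If you want only the text inside the parentheses, a stricter variant could look like the sketch below. It is not the pattern from the question, just one possible rewrite; the helper name parse_dhcp_line and the extra interface group are my own additions, and it assumes every line has the same layout as the sample.

```python
import re

# A named group inside a repeated group keeps only its last iteration.
demo = re.search(r'(?:(?P<word>\w+)\s*)*', 'a b c')
last_word = demo.group('word')  # holds 'c'; 'a' and 'b' were overwritten

# Hypothetical stricter pattern: one group per field, no repeated alternation.
LINE_RE = re.compile(r'''
    ^(?P<date>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+
    (?P<host>\S+)\s+
    (?P<facility>[^\s:]+):\s+
    (?P<action>DHCP\w+)\s+
    .*?                                   # filler such as: on 10.51.1.242 to
    (?P<mac>(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2})
    (?:\s+\((?P<message>[^)]*)\))?        # optional text inside parentheses
    (?:\s+via\s+(?P<interface>\S+))?      # optional interface name (assumed field)
    \s*$
''', re.VERBOSE)


def parse_dhcp_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


sample = ('2013-11-15 09:42:02 localhost dhcpd: DHCPACK on 10.51.1.242 to '
          '00:1e:8c:21:83:a0 (Hostname Unsuitable for Printing) via eth2')
fields = parse_dhcp_line(sample)
print(fields['message'])
```

Here message comes back without the surrounding parentheses, and lines that lack a parenthesized part still match, with message set to None.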
