Python regex to match string excluding word
I am having trouble building a regex, and I have spent two days searching Google, Stack Overflow and other documentation.
I have the following lines:
2015-07-08 12:49:07.183852|INFO |VirtualServerBase| 3| client disconnected 'Ròem'(id:6336) reason 'invokerid=20 invokername=Alphonse invokeruid=loremipsum2= reasonmsg=test'
2015-07-08 11:59:23.178055|INFO |VirtualServerBase| 3| client disconnected 'Trakiyen'(id:20460) reason 'invokerid=0 invokername=server reasonmsg=idle time exceeded'
2015-07-08 12:40:50.591450|INFO |VirtualServerBase| 3| client disconnected 'kalash'(id:20464) reason 'invokerid=136 invokername=Charles invokeruid=loremipsum= reasonmsg=Aller, Bisous! bantime=0
2015-07-08 00:23:03.235312|INFO |VirtualServerBase| 3| client disconnected 'Brigata FTW'(id:20451) reason 'invokerid=103 invokername=Bob invokeruid=loremipsum3= reasonmsg=En vous souhaitant une bonne soirée <3 bantime=28800'
I want to match only the first line, subject to these conditions:
the line does not contain invokername=server
the line does not contain bantime
In that case the result should be only the first line. Here is one regex I tried:
.*2015-07-08.*client disconnected.*invokername=[^server].*[^bantime=].*
I have shown only one regex here, but I have tried many different things (with ?!, etc.). I have read a lot of topics about exclusion on Stack Overflow but could not find a solution. I hope someone can help me.
You can get your line with
(?m)^(?!.*\b(?:invokername=server|bantime)\b).*2015-07-08.*client disconnected.*invokername=.*$
EXPLANATION:
(?m) - A multiline flag so that ^ and $ match at the start and end of each line.
^ - Start-of-line anchor
(?!.*\b(?:invokername=server|bantime)\b) - A negative look-ahead that makes sure there are no whole words invokername=server or bantime further on the line
.*2015-07-08.*client disconnected.*invokername=.* - A substring containing 2015-07-08, client disconnected and invokername=, with anything in between those substrings (except a line break)
$ - End of line
Alternatively, you can just match any line that has no disallowed substrings:
(?m)^(?!.*\b(?:invokername=server|bantime)\b).*$
This is a much better alternative if it does not "overmatch" for you.
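As a quick check, here is a minimal Python sketch that runs both patterns above over the four sample lines from the question. The names `log`, `strict` and `loose` are illustrative and do not come from the original post; the script prints only the timestamp of each matched line.

```python
import re

# The four sample lines from the question, joined into one multi-line string.
log = "\n".join([
    "2015-07-08 12:49:07.183852|INFO |VirtualServerBase| 3| client disconnected 'Ròem'(id:6336) reason 'invokerid=20 invokername=Alphonse invokeruid=loremipsum2= reasonmsg=test'",
    "2015-07-08 11:59:23.178055|INFO |VirtualServerBase| 3| client disconnected 'Trakiyen'(id:20460) reason 'invokerid=0 invokername=server reasonmsg=idle time exceeded'",
    "2015-07-08 12:40:50.591450|INFO |VirtualServerBase| 3| client disconnected 'kalash'(id:20464) reason 'invokerid=136 invokername=Charles invokeruid=loremipsum= reasonmsg=Aller, Bisous! bantime=0",
    "2015-07-08 00:23:03.235312|INFO |VirtualServerBase| 3| client disconnected 'Brigata FTW'(id:20451) reason 'invokerid=103 invokername=Bob invokeruid=loremipsum3= reasonmsg=En vous souhaitant une bonne soirée <3 bantime=28800'",
])

# Pattern 1: exclude the unwanted words AND require the expected substrings.
strict = re.compile(
    r"(?m)^(?!.*\b(?:invokername=server|bantime)\b)"
    r".*2015-07-08.*client disconnected.*invokername=.*$"
)
# Pattern 2: only exclude the unwanted words.
loose = re.compile(r"(?m)^(?!.*\b(?:invokername=server|bantime)\b).*$")

# Neither pattern has a capturing group, so findall returns whole lines.
strict_hits = strict.findall(log)
loose_hits = loose.findall(log)

for hit in strict_hits:
    print(hit.split("|")[0])  # prints: 2015-07-08 12:49:07.183852
```

On this sample both patterns return the same single line. The looser pattern would also accept any unrelated line that merely lacks the two excluded words, which is the "overmatch" risk mentioned above.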
You seem to confuse [^...] with (?!...). The former is a negated character class, while the latter is a negative lookahead. For example, [^server] matches a single character that is not s, e, r or v; it does not exclude the word server.
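The difference can be seen in a small Python sketch. The sample strings are made up for illustration (in particular, `invokername=steve` is a hypothetical name that does not appear in the question's log).

```python
import re

# [^server] is ONE character that is not 's', 'e', 'r' or 'v'.
char_class = re.compile(r"invokername=[^server]")
# (?!server) consumes nothing; it only asserts that "server" does not start here.
lookahead = re.compile(r"invokername=(?!server)")

# Hypothetical inputs: one to reject, two to accept.
samples = ["invokername=server", "invokername=steve", "invokername=Alphonse"]

class_results = [bool(char_class.search(s)) for s in samples]
lookahead_results = [bool(lookahead.search(s)) for s in samples]

print(class_results)      # [False, False, True]
print(lookahead_results)  # [False, True, True]
```

The character class rejects `steve` simply because it starts with `s`, while the lookahead rejects only names that begin with the exact text `server`.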
If we now also keep in mind that a negative lookahead is applied at the current position, we need:
.*?2015-07-08.*?client disconnected.*?(invokername=(?!server))((?!.*?bantime=).*)
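A minimal sketch of how this pattern could be applied line by line in Python, using the four sample lines from the question. The names `pattern`, `lines` and `kept` are illustrative, not part of the original answer.

```python
import re

pattern = re.compile(
    r".*?2015-07-08.*?client disconnected.*?"
    r"(invokername=(?!server))"   # "server" must not follow invokername=
    r"((?!.*?bantime=).*)"        # no bantime= in the rest of the line
)

# The four sample lines from the question.
lines = [
    "2015-07-08 12:49:07.183852|INFO |VirtualServerBase| 3| client disconnected 'Ròem'(id:6336) reason 'invokerid=20 invokername=Alphonse invokeruid=loremipsum2= reasonmsg=test'",
    "2015-07-08 11:59:23.178055|INFO |VirtualServerBase| 3| client disconnected 'Trakiyen'(id:20460) reason 'invokerid=0 invokername=server reasonmsg=idle time exceeded'",
    "2015-07-08 12:40:50.591450|INFO |VirtualServerBase| 3| client disconnected 'kalash'(id:20464) reason 'invokerid=136 invokername=Charles invokeruid=loremipsum= reasonmsg=Aller, Bisous! bantime=0",
    "2015-07-08 00:23:03.235312|INFO |VirtualServerBase| 3| client disconnected 'Brigata FTW'(id:20451) reason 'invokerid=103 invokername=Bob invokeruid=loremipsum3= reasonmsg=En vous souhaitant une bonne soirée <3 bantime=28800'",
]

# re.match anchors at the start of each string, so every line is tested on its own.
kept = [line for line in lines if pattern.match(line)]

for line in kept:
    print(line.split("|")[0])  # prints: 2015-07-08 12:49:07.183852
```

Because each lookahead is evaluated where it stands, the bantime= check only covers the text after invokername=. A bantime= field that appeared earlier in the line would not be caught by this pattern.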
Edit: Credit where credit is due: @stribizhev's solution is better than mine:
(?m)^(?!.*\b(?:invokername=server|bantime)\b).*$
In addition to @llogiq's answer, which explained the difference between a negated character class and a negative look-ahead, you can also use just the following regex, which relies on a negative look-ahead:
^((?!bantime|(?:invokername=server)).)*$
See demo https://regex101.com/r/hI5dR0/1
>>> import re
>>> # assuming s holds the log lines from the question, separated by newlines
>>> re.search(r'^((?!bantime|(?:invokername=server)).)*$', s, re.M).group()
"2015-07-08 12:49:07.183852|INFO |VirtualServerBase| 3| client disconnected 'R\xc3\xb2em'(id:6336) reason 'invokerid=20 invokername=Alphonse invokeruid=loremipsum2= reasonmsg=test'"