Python regex: inverse match at the end of the line

I am using regex to match patterns in my log. I need to match a pattern at the beginning of the line, but then invert the match at the end, i.e.:

I need to match this line:

reject: RCPT from unknown[165.231.143.153]: 450 4.7.25 from=<spameri@tiscali.it> to=<spameri@tiscali.it>

But not this line:

reject: RCPT from unknown[165.231.143.153]: 450 4.7.25 from=<spameri@tiscali.it> to=<alice@mydomain.com>

Basically, if the line contains to=<alice@mydomain.com> (or any other email address at mydomain.com), it should not trigger a match. Anything else, e.g. to=<bob@otherdomain.com> or to=<alice@thirddomain.com>, should match.

I tried using this negative lookahead pattern:

'^reject: RCPT from [A-Za-z0-9\.-]+\[{ip}\]: .* to=<[A-Za-z0-9\._-]+@(?!mydomain.com)>',

where I am trying to negate mydomain.com using the construct (?!mydomain.com).

How can I do that?

Lookaheads are non-consuming, i.e. the regex index stays where it was and the text matched by the lookahead is not added to the overall match value.

Thus, (?!mydomain.com) in (?!mydomain.com)> checks that the text immediately to the right of the current position is not mydomain, any character, com. Since the next character must be > , that check is always true, so the lookahead never excludes anything.
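To see this behaviour in isolation, here is a small sketch. The pattern is a shortened version of the one in the question (only the to=<...> part is kept), and the third test string is an artificial input invented for illustration:

```python
import re

# Shortened form of the question's pattern (assumption: only the
# "to=<...>" tail is kept). The lookahead is followed directly by '>',
# so the regex effectively requires '@>' in the input.
broken = re.compile(r'to=<[A-Za-z0-9._-]+@(?!mydomain.com)>')

samples = [
    'to=<spameri@tiscali.it>',   # wanted: match
    'to=<alice@mydomain.com>',   # wanted: no match
    'to=<user@>',                # artificial: '>' right after '@'
]

# True where the broken pattern finds a match.
results = [bool(broken.search(s)) for s in samples]
print(results)
```

Only the artificial string matches: neither real address does, because nothing in the pattern consumes the domain between @ and > .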

You need to consume the characters before > (and put > inside the lookahead so that it checks the whole domain), thus you can use

^reject: RCPT from [A-Za-z0-9.-]+\[{ip}]: .* to=<[A-Za-z0-9._-]+@(?!mydomain\.com>)[^>]*>

Note that you do not need to escape . inside square brackets (a character class), and you do not need to escape ] when it is not inside a character class.
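A quick way to check both escaping remarks in Python's re module is sketched below. The host string and the bracketed x are made-up test data, not taken from the log:

```python
import re

# Escaped and unescaped variants of the same sub-pattern.
escaped = re.compile(r'[A-Za-z0-9\.-]+\[x\]')
plain = re.compile(r'[A-Za-z0-9.-]+\[x]')

sample = 'mail.example-host[x]'  # hypothetical input

# Both variants accept the same string.
same_result = bool(escaped.fullmatch(sample)) and bool(plain.fullmatch(sample))

# Inside a character class '.' is a literal dot, not "any character".
dot_is_literal = re.fullmatch(r'[.]', 'a') is None and re.fullmatch(r'[.]', '.') is not None

print(same_result, dot_is_literal)
```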

The @(?!mydomain\.com>)[^>]*> part matches

  • @ - a @ char
  • (?!mydomain\.com>) - not immediately followed by mydomain.com>
  • [^>]* - (a negated character class matching) any zero or more chars other than >
  • > - a > char.
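As a rough check of the corrected pattern against the two log lines from the question, here is a minimal sketch. The helper name build_pattern is hypothetical, and filling the {ip} placeholder with an escaped literal IP address is an assumption about how the pattern is used:

```python
import re

def build_pattern(ip):
    # Hypothetical helper: substitutes an escaped literal IP for the
    # '{ip}' placeholder in the answer's pattern (assumption).
    return re.compile(
        r'^reject: RCPT from [A-Za-z0-9.-]+\[' + re.escape(ip) + r']: .* '
        r'to=<[A-Za-z0-9._-]+@(?!mydomain\.com>)[^>]*>'
    )

pattern = build_pattern('165.231.143.153')

prefix = 'reject: RCPT from unknown[165.231.143.153]: 450 4.7.25 from=<spameri@tiscali.it> '
should_match = prefix + 'to=<spameri@tiscali.it>'
should_not_match = prefix + 'to=<alice@mydomain.com>'

matched = bool(pattern.search(should_match))
rejected = pattern.search(should_not_match) is None
print(matched, rejected)
```

Because > is part of the lookahead, only an address whose domain is exactly mydomain.com is excluded; a longer domain that merely starts with mydomain.com still matches.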
