
Python regex: inverse match at the end of the line

I am using a regex to match patterns in my log. I need to match a pattern at the beginning of the line, but then invert the match at the end, i.e.:

I need to match this line:

reject: RCPT from unknown[165.231.143.153]: 450 4.7.25 from=<spameri@tiscali.it> to=<spameri@tiscali.it>

But not this line:

reject: RCPT from unknown[165.231.143.153]: 450 4.7.25 from=<spameri@tiscali.it> to=<alice@mydomain.com>

Basically, if the line contains to=<alice@mydomain.com> (or any other email address at mydomain.com), it should not trigger a match. Otherwise, for anything else, e.g. to=<bob@otherdomain.com> or to=<alice@thirddomain.com>, it should match.

I tried using this negative lookahead pattern:

'^reject: RCPT from [A-Za-z0-9\.-]+\[{ip}\]: .* to=<[A-Za-z0-9\._-]+@(?!mydomain.com)>',

where I am negating mydomain.com using the construct (?!mydomain.com)

How can I do that?

Lookaheads are non-consuming, i.e. the regex index remains where it was and the text they match is not added to the overall match value.

Thus, (?!mydomain.com) in (?!mydomain.com)> checks that mydomain, any char, and com do not appear immediately to the right of the current location. As the next char must be >, that check is always true, and the pattern can only match when > comes directly after @.
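To see this behaviour in isolation, here is a minimal sketch that tests only the tail of the question's pattern. The sample strings and the name original_tail are made up for illustration.

```python
import re

# Tail of the pattern from the question: "@", the lookahead, then ">".
original_tail = re.compile(r"@(?!mydomain.com)>")

# The lookahead consumes nothing, so ">" has to come right after "@".
print(bool(original_tail.search("to=<bob@otherdomain.com>")))  # False
print(bool(original_tail.search("to=<alice@mydomain.com>")))   # False

# Only an address with an empty domain part can get through.
print(bool(original_tail.search("to=<bob@>")))                 # True
```

So the original pattern rejects both kinds of line, not only the mydomain.com ones.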

You need to consume the chars before > and thus you can use

^reject: RCPT from [A-Za-z0-9.-]+\[{ip}]: .* to=<[A-Za-z0-9._-]+@(?!mydomain\.com>)[^>]*>

Note you do not need to escape . inside square brackets (aka character class) and you do not need to escape ] when it is not inside a character class.
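A quick way to convince yourself of both escaping rules is a small check like the one below. It is only a sketch; the sample string and variable names are invented for the demonstration.

```python
import re

# "." inside a character class is a literal dot, escaped or not.
escaped = re.compile(r"[A-Za-z0-9\.-]+")
unescaped = re.compile(r"[A-Za-z0-9.-]+")
sample = "mail.example-host.org!"
print(escaped.match(sample).group())    # mail.example-host.org
print(unescaped.match(sample).group())  # mail.example-host.org

# Inside a class the dot does not mean "any char".
print(re.fullmatch(r"[.]", "a"))        # None

# "]" outside a character class is an ordinary literal.
print(bool(re.fullmatch(r"\[x]", "[x]")))  # True
```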

The @(?!mydomain\.com>)[^>]*> part matches

  • @ - a @ char
  • (?!mydomain\.com>) - not immediately followed with mydomain.com>
  • [^>]* - (a negated character class matching) any zero or more chars other than >
  • > - a > char.
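
Putting the pieces together, the corrected pattern can be tried against the two log lines from the question. This is a sketch under one assumption: the question does not show how {ip} gets filled in, so the code below uses str.format with re.escape, and the names TEMPLATE and build_pattern are hypothetical.

```python
import re

# Corrected pattern from the answer; {ip} is a placeholder filled in below.
TEMPLATE = (
    r"^reject: RCPT from [A-Za-z0-9.-]+\[{ip}]: "
    r".* to=<[A-Za-z0-9._-]+@(?!mydomain\.com>)[^>]*>"
)

def build_pattern(ip):
    # Assumption: the IP is inserted literally, so its dots are escaped.
    return re.compile(TEMPLATE.format(ip=re.escape(ip)))

pattern = build_pattern("165.231.143.153")

external = (
    "reject: RCPT from unknown[165.231.143.153]: 450 4.7.25 "
    "from=<spameri@tiscali.it> to=<spameri@tiscali.it>"
)
internal = (
    "reject: RCPT from unknown[165.231.143.153]: 450 4.7.25 "
    "from=<spameri@tiscali.it> to=<alice@mydomain.com>"
)

print(bool(pattern.search(external)))  # True
print(bool(pattern.search(internal)))  # False
```

Note that the lookahead is checked right after @, so an address on a subdomain such as bob@sub.mydomain.com would still match. If those should be excluded too, the lookahead would have to be adjusted.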
