AD FS log multiple IP extraction through regex

Stuck on a PCRE regex question. I am trying to extract all IPs following a field ("Client IP: ") in an AD FS log.
My log looks like this (truncated to save space):

EventCode=411
EventType=0
Type=Information
SidType=1
TaskCategory=Printers
OpCode=Info
Token Type: 
http://schemas.microsoft.com/ws/2006/05/identitymodel/tokens/UserName  

Client IP: 
110.19.100.155,2603:1032:205:14::5 

Error message: 
******-This user can't sign in because this account is currently disabled 

The desired end result is that both IP addresses end up in the src_ip field, and that the regex is only applied when the event contains EventCode=411 or 512, etc.

What I have so far is this:

(\s\n|,)(?<src_ip>(?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))

This works, but it does not restrict matching to events with the required event codes. So when I do this:

(?ms)(?:EventCode=(411|512))\n.*?(\s\n|,)(?P<src_ip>(?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))

It only picks up the first IP.

Any ideas?

You can adapt your pattern by adding a custom boundary based on the \G operator, which matches either the start of the string or, as needed here, the end of the previous successful match:

(?ms)(?:\G(?!\A)\s*,\s*|EventCode=(411|512)\n.*?\R)\K(?P<src_ip>(?:\d{1,3}\.){3}(?:\d{1,3})|(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}[\d%A-Fa-f.]*(?:::)?|::[\dA-Fa-f.]{1,15}|::)

See the regex demo.

Basically, the main difference is the (?:\G(?!\A)\s*,\s*|EventCode=(411|512)\n.*?\R)\K prefix:

  • \G(?!\A)\s*,\s* - the end of the preceding successful match (the start-of-string position is excluded by the negative lookahead (?!\A)), then a comma surrounded by zero or more whitespace characters
  • | - or
  • EventCode=(411|512)\n.*?\R - the EventCode= substring, then (411|512) captures 411 or 512 into Group 1, then \n matches a newline, and .*?\R matches as few characters as possible (including line breaks, because of the (?s) flag) up to a line break that is immediately followed by the subsequent subpatterns
  • \K - the match reset operator, which discards all text matched so far from the overall match buffer.
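For readers whose regex engine lacks \G, \K and \R (Python's built-in re module is one example), the same idea can be approximated in two steps. This is a rough sketch rather than part of the answer above: the function name extract_src_ips, the shortened SAMPLE event and the use of \r?\n in place of \R are all assumptions made for illustration. Pattern.match(text, pos) only matches at pos, so it plays the role of \G here.

```python
import re
from typing import List

# Simplified IP sub-patterns taken from the answer's regex (not a strict validator).
IPV4 = r"(?:\d{1,3}\.){3}\d{1,3}"
IPV6 = (r"(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}[\d%A-Fa-f.]*(?:::)?"
        r"|::[\dA-Fa-f.]{1,15}|::")
IP_SRC = "(?:" + IPV4 + "|" + IPV6 + ")"

# Step 1: gate on the event code and find the first IP at the start of a line.
FIRST = re.compile(
    r"EventCode=(?:411|512)\r?\n.*?\r?\n(?P<src_ip>" + IP_SRC + ")", re.S
)
# Step 2: pieces used to continue right where the previous IP ended.
SEP = re.compile(r"\s*,\s*")
IP = re.compile(IP_SRC)


def extract_src_ips(event: str) -> List[str]:
    """Return the comma-separated IPs of an event with EventCode 411 or 512."""
    first = FIRST.search(event)
    if first is None:
        return []  # other event codes are ignored entirely
    ips = [first.group("src_ip")]
    pos = first.end()
    while True:
        sep = SEP.match(event, pos)  # anchored at pos, like \G(?!\A)\s*,\s*
        if sep is None:
            break
        nxt = IP.match(event, sep.end())
        if nxt is None:
            break
        ips.append(nxt.group())
        pos = nxt.end()
    return ips


# Hypothetical, shortened version of the event from the question.
SAMPLE = (
    "EventCode=411\n"
    "EventType=0\n"
    "Type=Information\n"
    "Client IP: \n"
    "110.19.100.155,2603:1032:205:14::5 \n"
    "\n"
    "Error message: \n"
)

print(extract_src_ips(SAMPLE))
```

The loop stops as soon as the text after an IP is not a comma followed by another IP, which is the same condition that ends the chain of \G matches in the PCRE version.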

You also had a slight issue: [\d\%A-Fa-z\.] should be written as [\d%A-Fa-f.]. The a-z range accepts any lowercase letter rather than just hex digits, and % and . do not need escaping inside a character class.
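The effect of the a-z range can be checked quickly. The snippet below is only an illustration using Python's re module, and the candidate string "5xyz" is a made-up, non-hex value:

```python
import re

# Character class from the question: a-z admits any lowercase letter.
loose = re.compile(r"[\d%A-Fa-z.]+")
# Corrected class: digits, hex letters, % and . only.
strict = re.compile(r"[\d%A-Fa-f.]+")

candidate = "5xyz"  # hypothetical tail that is not valid hex

print(bool(loose.fullmatch(candidate)))   # the loose class accepts it
print(bool(strict.fullmatch(candidate)))  # the strict class rejects it
```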
