
RegEx Filter Works In RegExr But Not Logstash Grok

I am trying to filter the loglevel for some log files that I have. My issue is that Grok seems unable to handle \w being at the start of the filter.

I am using this site to verify regex: https://regexr.com/

I am using this site to test the Grok filter: http://grokdebug.herokuapp.com/

Here is my example log line: 2020-04-07T13:08:19.261-0700|INFO |||

Here is what I am trying to run:

(?<timestamp>.+?(?=\|))(?<loglevel>\w+?(?= \|))

The Grok debugger reports no matches. If I replace the "\w" with ".", it matches the line immediately, but leaves the non-alphanumeric "|" character in the loglevel:

{
  "timestamp": [
    [
      "2020-04-07T13:08:19.261-0700"
    ]
  ],
  "loglevel": [
    [
      "|INFO"
    ]
  ]
}

It should by all means work. It's just saying "match \w characters". I am clearly lacking in regex knowledge here. Does anybody know what is going on and is willing to throw a few pointers my way?

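Your regex does not match the string because (?=\|) is a lookahead, which is a non-consuming pattern. After the timestamp group matches, the regex position is still in front of the | character, and \w cannot match | . Replacing \w with . only appears to work because . matches the | itself, which is why |INFO ends up in loglevel.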
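The behaviour can be reproduced outside Grok. The following is a minimal sketch in Python, used here only as a stand-in for the Grok debugger. It assumes Python's re module treats the lookaheads the same way Grok's regex engine does for this pattern, and it spells the named groups as (?P<name>...) because Python's re uses that form rather than (?<name>...). The variable names are illustrative only.

```python
import re

# Example log line from the question.
line = "2020-04-07T13:08:19.261-0700|INFO |||"

# The pattern from the question. Only the named-group spelling is changed
# to (?P<name>...), which is what Python's re module expects.
original = re.compile(r"(?P<timestamp>.+?(?=\|))(?P<loglevel>\w+?(?= \|))")

# The same pattern with \w replaced by a dot.
with_dot = re.compile(r"(?P<timestamp>.+?(?=\|))(?P<loglevel>.+?(?= \|))")

m_original = original.search(line)
m_dot = with_dot.search(line)

# The lookahead leaves the position in front of "|", so \w has nothing to match.
print(m_original)

# The dot matches the "|" itself, so it leaks into the loglevel group.
print(m_dot.group("timestamp"))
print(m_dot.group("loglevel"))
```

The first pattern finds no match at all, while the second captures the leading | as part of loglevel, which is the same result the question shows.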

You may fix your current pattern with

(?<timestamp>.+?)\|(?<loglevel>\w+) \|
                 ^^                ^^^

The fields are created with the named group captures anyway, so you need no lookarounds here.
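As a quick check of the lookaround-free pattern, here is a minimal sketch in Python rather than Grok. It assumes the two engines agree on this simple pattern, and it again uses Python's (?P<name>...) spelling for the named groups. The names fixed and fields are illustrative only.

```python
import re

# Example log line from the question.
line = "2020-04-07T13:08:19.261-0700|INFO |||"

# The pipes and the space are consumed between the groups instead of being
# checked with lookaheads, so they never end up inside a captured field.
fixed = re.compile(r"(?P<timestamp>.+?)\|(?P<loglevel>\w+) \|")

match = fixed.search(line)

# groupdict() returns the named groups, roughly what Grok turns into fields.
fields = match.groupdict() if match else {}
print(fields)
```

Because the delimiters are matched outside the named groups, timestamp and loglevel hold only the values themselves.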

Note that you may actually use the built-in Grok patterns

%{TIMESTAMP_ISO8601:timestamp}\|%{LOGLEVEL:loglevel}

to parse your current input.
