I am trying to extract the log level from some log files that I have. My issue is that Grok seems unable to handle \w being at the start of the loglevel group.
I am using this site to verify regex: https://regexr.com/
I am using this site to test the Grok filter: http://grokdebug.herokuapp.com/
Here is my example log line: 2020-04-07T13:08:19.261-0700|INFO |||
Here is what I am trying to run:
(?<timestamp>.+?(?=\|))(?<loglevel>\w+?(?= \|))
This says NO MATCHES found. If I replace the "\w" with "." it finds the line immediately, but leaves the non-alpha character:
{
  "timestamp": [
    [
      "2020-04-07T13:08:19.261-0700"
    ]
  ],
  "loglevel": [
    [
      "|INFO"
    ]
  ]
}
It should by all means work. It's just saying "match \w characters". I am clearly lacking in regex knowledge here. Does anybody know what is going on, and is anyone willing to throw a few pointers my way?
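Your regex does not match the string because (?=\|) is a non-consuming pattern (a lookahead). It checks that a | follows the timestamp but leaves it unconsumed, so the next part of the pattern, \w, has to match that | and fails. With . in place of \w, the | is matched and ends up inside the loglevel group.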
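The behaviour can be reproduced outside Grok. The following is a minimal sketch in Python, assuming that Python's re module behaves like Grok's regex engine for these constructs; the only change to the patterns is that (?<name>...) is written as (?P<name>...), which is the named-group syntax Python requires.

```python
import re

line = "2020-04-07T13:08:19.261-0700|INFO |||"

# Pattern from the question, with Grok's (?<name>...) written as (?P<name>...).
original = re.compile(r"(?P<timestamp>.+?(?=\|))(?P<loglevel>\w+?(?= \|))")

# Same pattern with "." in place of "\w", as tried in the question.
with_dot = re.compile(r"(?P<timestamp>.+?(?=\|))(?P<loglevel>.+?(?= \|))")

# The lookahead leaves the "|" unconsumed, so \w has to match "|" and cannot.
original_match = original.search(line)
print(original_match)  # None

# "." does match the "|", which is why it shows up inside loglevel.
dot_match = with_dot.search(line)
print(dot_match.groupdict())
# {'timestamp': '2020-04-07T13:08:19.261-0700', 'loglevel': '|INFO'}
```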
You may fix your current pattern with
(?<timestamp>.+?)\|(?<loglevel>\w+) \|
                 ^^                ^^^
The fields are created with the named group captures anyway, so you need no lookarounds here.
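As a quick check of the pattern without lookarounds, here is a minimal sketch in Python. It assumes Python's re module is an acceptable stand-in for Grok's regex engine here, and it writes the named groups as (?P<name>...) because Python does not accept (?<name>...).

```python
import re

line = "2020-04-07T13:08:19.261-0700|INFO |||"

# The delimiters "\|" and " \|" are consumed outside the named groups,
# so they never end up in the captured fields.
fixed = re.compile(r"(?P<timestamp>.+?)\|(?P<loglevel>\w+) \|")

fixed_match = fixed.search(line)
print(fixed_match.groupdict())
# {'timestamp': '2020-04-07T13:08:19.261-0700', 'loglevel': 'INFO'}
```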
Note you may actually use
%{TIMESTAMP_ISO8601:timestamp}\|%{LOGLEVEL:loglevel}
to parse your current input.