I am trying to configure logstash to manage my various log sources, one of which is Mongrel2. The format used by Mongrel2 is tnetstring, where a log message will take the form
86:9:localhost,12:192.168.33.1,5:57089#10:1411396297#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]
I want to write my own grok patterns to extract certain fields from the above format. I started by testing my regex on the above message in an online regex tester. The regex is
^(?:[^:]*\:){2}([^,]*)
This matches localhost. When I use the same regex as a grok pattern in the form
TEST ^(?:[^:]*\:){2}([^,]*)
MONGREL %{TEST:test}
and configure logstash with
filter {
  grok {
    match => [ "message", "%{MONGREL}" ]
  }
}
the same regex results in the match 86:9:localhost. I can't figure out where I am going wrong. Is it that the regex engine I was using to test is based on Python, while the grok filter regex is based on Oniguruma?
Currently testing it in grokdebug with the following input
86:9:localhost,12:192.168.33.1,5:57089#10:1411396297#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]
and the following pattern
(?<hostname>^(?:[^:]*\:){2}([^,]*))
resulting in
{
  "hostname": [
    [
      "86:9:localhost"
    ]
  ]
}
where I want
{
  "hostname": [
    [
      "localhost"
    ]
  ]
}
Give http://grokdebug.herokuapp.com/ a try. It is the best way to debug grok patterns without losing hair over them.
A pattern like this will extract the host name:
^(\d+)?:(\d+)?:(?<hostname>[^,]+),
Or, written in a manner closer to your original:
^(?:[^:]*\:){2}(?<hostname>[^,]*)
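The capture name needs to be inside the parentheses around the part you want to capture. In your pattern the named group wrapped the whole expression, so it captured everything from the start of the line, including the 86:9: prefix. The same thing happens with %{TEST:test}: grok stores the entire text matched by TEST under that name, not the inner unnamed group.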
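The difference is not specific to Oniguruma, and it can be reproduced in Python as a rough sketch. Note the assumptions: Python's re module spells named groups as (?P<name>...) rather than the (?<name>...) form grok uses, and the variable names below are only illustrative.

```python
import re

# Sample Mongrel2 log line from the question.
line = ("86:9:localhost,12:192.168.33.1,5:57089#10:1411396297"
        "#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]")

# Name wraps the whole expression, like (?<hostname>^...) in the question.
outer = re.compile(r"(?P<hostname>^(?:[^:]*:){2}([^,]*))")

# Name wraps only the part that should be kept, as in the answer.
inner = re.compile(r"^(?:[^:]*:){2}(?P<hostname>[^,]*)")

outer_host = outer.search(line).group("hostname")
inner_host = inner.search(line).group("hostname")

print(outer_host)  # 86:9:localhost
print(inner_host)  # localhost
```

An online tester that reports "localhost" for the original regex is most likely showing the first unnamed group, while grok only reports named captures, which would explain the apparent mismatch between the two tools.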