Logstash grok failing

I am trying to grok a message, but it fails with _grokparsefailure and the log doesn't say what it is failing on. The grok pattern works on https://grokdebug.herokuapp.com/

input {
  file {
    type => "apache-access"
    path => "C:/prdLogs/sent/*"
  }
}

filter {
  grok {
    match => ['message', '%{IP:clientip} - - \[%{GREEDYDATA:raw_timestamp}   \] "%{WORD:httpmethod} %{NOTSPACE:referrer} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} "-" "%{NOTSPACE:request}" %{QS:UserAgent} %{WORD:httpmethodO} - - HTTP/%{NUMBER:httpversion2} "%{WORD:session}:%{WORD:httpmed}" "-" %{NUMBER:duration}' ]
  }
  date {
    match => [ "raw_timestamp", 'dd/MMM/yyyy:HH:mm:ss Z' ]
    target => '@timestamp'
  }
}

output {
  elasticsearch { hosts => ["111.44.44.44:9200"] }
}

The data looks like:

199.77.22.22 - - [26/Feb/2017:10:18:45 +0800] "GET /myapp/app/i18n/key/parent.selector.label.select.item/?locale=en_GB&dojo.preventCache=1488075524942 HTTP/1.1" 200 "-" "https://mywebsite.here.com:31000/myApp/home.do" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; Tablet PC 2.0)" GET - - HTTP/1.1 "0000bKOk4n4SSBHuyJJKed085D6:1ap8u8p8j" "-" 3203
199.77.22.22 - - [26/Feb/2017:10:18:45 +0800] "GET /myapp/app/i18n/key/parent.selector.label.no.recently.used/?locale=en_GB&dojo.preventCache=1488075525483 HTTP/1.1" 200 "-" "https://mywebsite.here.com:31000/myApp/home.do" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; Tablet PC 2.0)" GET - - HTTP/1.1 "0000bKOk4n4SSBHuyJJKed085D6:1ap8u8p8j" "-" 3159
199.77.22.22 - - [26/Feb/2017:10:18:46 +0800] "GET /myapp/app/i18n/key/selector.label.selected/?locale=en_GB&dojo.preventCache=1488075525843 HTTP/1.1" 200 "-" "https://mywebsite.here.com:31000/myApp/home.do" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; Tablet PC 2.0)" GET - - HTTP/1.1 "0000bKOk4n4SSBHuyJJKed085D6:1ap8u8p8j" "-" 3600
199.77.22.22 - - [26/Feb/2017:10:18:46 +0800] "GET /myapp/app/i18n/key/actor.selector.label.remove.all/?locale=en_GB&dojo.preventCache=1488075526305 HTTP/1.1" 200 "-" "https://mywebsite.here.com:31000/myApp/home.do" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; Tablet PC 2.0)" GET - - HTTP/1.1 "0000bKOk4n4SSBHuyJJKed085D6:1ap8u8p8j" "-" 3224
199.77.22.22 - - [26/Feb/2017:10:18:46 +0800] "GET /myapp/app/i18n/key/com.label.filter.objects/?locale=en_GB&dojo.preventCache=1488075526711 HTTP/1.1" 200 "-" "https://mywebsite.here.com:31000/myApp/home.do" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; Tablet PC 2.0)" GET - - HTTP/1.1 "0000bKOk4n4SSBHuyJJKed085D6:1ap8u8p8j" "-" 3299

This is actually an Apache access log, but I was unable to use COMBINEDAPACHELOG or COMMONAPACHELOG; both give the same error.

All entries in Elasticsearch are tagged with "_grokparsefailure". I ran Logstash with log.level set to debug, but I am not seeing any errors in the log.

I am using the latest version of Logstash.

Please advise.

R2 D2: Thanks, I tried this but no joy :(

I created a patterns file and pasted your pattern into it. I changed the payload to just "130.39.22.22 - - [23/Feb/2015:10:18:45 +0800]" and used the following filter:

filter {
  grok {
    patterns_dir => ["c:/logstashconfig/patterns"]
    match => ['message', '%{IP:clientip} - - /[%{DATE_CUSTOM:timestamp}/]']
  }
  date {
    match => [ "timestamp" , 'dd/MMM/yyyy:HH:mm:ss Z' ]
    target => '@timestamp'
  }
}

The debug log in logstash:

{
          "path" => "C:/prdLogs/sent/test",
    "@timestamp" => 2017-03-03T00:06:15.269Z,
      "@version" => "1",
          "host" => "hkw20012125",
       "message" => "130.39.22.22 - -     [23/Feb/2015:10:18:45 +0800]\r",
          "type" => "apache-access",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

Any ideas? Is it the +0800 at the end of the data? Thanks.

When you have to build your own patterns, start from the left side, go slowly, and use the debugger.

If you test this pattern:

%{IP:clientip} - - \[

it works, but this one:

%{IP:clientip} - - \[%{GREEDYDATA:raw_timestamp}   \]

doesn't. Comparing your pattern to the input shows that the input has no spaces between the timestamp and the closing bracket, while your pattern expects three.

Changing this part of the pattern to:

%{IP:clientip} - - \[%{GREEDYDATA:raw_timestamp}\]

works.
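
The same left-to-right approach can also be tried locally instead of in the online debugger. A minimal sketch, assuming a throwaway config file (the name test.conf is hypothetical) that swaps the file input and Elasticsearch output from the question for stdin and stdout; those two substitutions are my assumption for testing and are not part of the original setup:

```
# test.conf (hypothetical file name): paste one log line on stdin,
# then inspect the parsed event printed on stdout.
input {
  stdin { }
}

filter {
  grok {
    # Start with the part known to work, then append one element at a time.
    # Note there is no space between the timestamp capture and \].
    match => { "message" => '%{IP:clientip} - - \[%{GREEDYDATA:raw_timestamp}\]' }
  }
}

output {
  # rubydebug prints every field of the event, including the tags array
  stdout { codec => rubydebug }
}
```

Starting Logstash with -f test.conf and pasting a sample line should show clientip and raw_timestamp as fields. As soon as a newly appended element makes the _grokparsefailure tag appear, that element is the one to look at.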

I think that once you have GREEDYDATA in your pattern, it takes in the rest of the line from the log.

GREEDYDATA's pattern looks like:

GREEDYDATA .*   <-- matches as much of the rest of the line as it can

And your grok match should look something like this if I'm not mistaken:

grok {
  match => ['message', '%{IPV4:clientip} - - %{GREEDYDATA:data}']
}

Unless you need the values extracted separately, the above grok should do the trick for you. I also think the way you're matching the timestamp is wrong. To handle your timestamp, you need the patterns below in your patterns file:

MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
MONTH \b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b
YEAR (?>\d\d){1,2}
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
DATE_CUSTOM %{MONTHDAY}[/]%{MONTH}[/]%{YEAR}:%{TIME}

And then you could simply use this within your grok match:

grok {
    match => ['message', '%{IPV4:clientip} - - \[%{DATE_CUSTOM:timestamp} %{GREEDYDATA:data}']
}

Now you'll be able to match the timestamp as:

date {
    match => [ "timestamp" , 'dd/MMM/yyyy:HH:mm:ss Z' ]
    target => '@timestamp'
}
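
One thing to check with the patterns above: DATE_CUSTOM stops at the seconds, so the +0800 offset would land in data and the Z in the date format would have nothing to match. A minimal sketch of an alternative, assuming the built-in HTTPDATE grok pattern (which covers the offset as well) in place of a custom patterns file; the field names are kept from the example above:

```
filter {
  grok {
    # HTTPDATE matches e.g. 26/Feb/2017:10:18:45 +0800, offset included
    match => { "message" => '%{IPV4:clientip} - - \[%{HTTPDATE:timestamp}\] %{GREEDYDATA:data}' }
  }
  date {
    # Z consumes the +0800 offset captured by HTTPDATE
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "@timestamp"
  }
}
```

With this variant no patterns_dir setting should be needed, since HTTPDATE ships with the default grok patterns.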

Hope this helps!
