
Logstash grok filter custom pattern is not working

I have a log file ( http://codepad.org/vAMFhhR2 ) and I want to extract a specific number from it (line 18). I wrote a grok filter with a custom pattern and tested it on http://grokdebug.herokuapp.com/ , where it works fine and extracts the value I want.

Here's what my logstash.conf looks like:

input {
    tcp {
        port => 5000
    }
}

filter {
    grok {
        match => [ "message", "(?<scraped>(?<='item_scraped_count': ).*(?=,))" ]
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
    }
}

But it doesn't match any record from the same log in Kibana.


Your regexp may be valid, but the lookbehind and lookahead ("?<=" and "?=") are not a good choice in this context. Instead, you could use a much simpler filter:

match => [ "message", "'item_scraped_count': %{NUMBER:scraped}" ]

This will extract the number after 'item_scraped_count': into a field called scraped, using Grok's built-in NUMBER pattern.

Result in Kibana:

  "_index": "logstash-2017.04.11",
  "_type": "logs",
  "_source": {
    "@timestamp": "2017-04-11T20:02:13.194Z",
    "scraped": "22",
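
Note that scraped arrives as the string "22" in the result above, because grok stores captures as strings by default. If a numeric field is preferable (for sums or range queries in Kibana), grok's optional type suffix can do the conversion. A minimal sketch, assuming an integer is what is wanted:

```
filter {
    grok {
        # ":int" asks grok to store the capture as an integer instead of a string
        match => [ "message", "'item_scraped_count': %{NUMBER:scraped:int}" ]
    }
}
```

If an existing index has already mapped scraped as text, the new type will likely only show up in a newly created index.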

If I may suggest another improvement: since your message is spread across multiple lines you could easily merge it using the multiline input codec:

input {
    tcp {
        port => 5000
        codec => multiline {
            pattern => "^(\s|{')"
            what => "previous"
        }
    }
}

This will merge every line that starts with either a whitespace character or {' into the previous line.
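
Putting both suggestions together, the whole pipeline might look like the sketch below. It assumes the port and Elasticsearch host from the question. The grok pattern needs no change because it is not anchored, so it can match anywhere inside the merged multi-line event.

```
input {
    tcp {
        port => 5000
        # join continuation lines (leading whitespace or {') onto the previous line
        codec => multiline {
            pattern => "^(\s|{')"
            what => "previous"
        }
    }
}

filter {
    grok {
        match => [ "message", "'item_scraped_count': %{NUMBER:scraped}" ]
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
    }
}
```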

