
Creating a custom GROK pattern

Currently, I'm trying to create a grok pattern for this log:

2020-03-11 05:54:26,174 JMXINSTRUMENTS-Threading [{"timestamp":"1583906066","label":"Threading","ObjectName":"java.lang:type\u003dThreading","attributes":[{"name":"CurrentThreadUserTime","value":18600000000},{"name":"ThreadCount","value":152},{"name":"TotalStartedThreadCount","value":1138},{"name":"CurrentThreadCpuTime","value":20804323112},{"name":"PeakThreadCount","value":164},{"name":"DaemonThreadCount","value":136}]}]

At the moment I can match correctly up to JMXINSTRUMENTS-Threading by using this pattern:

%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message}

But I cannot seem to match all the values after this. Does anybody have an idea as to what pattern I should use?

It worked for me after defining a different source and target in the JSON filter. Thanks for the help!

filter {
  if "atlassian-jira-perf" in [tags] {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message_raw}" }
      tag_on_failure => ["no_match"]
      add_tag => ["bananas"]
    }
    if "no_match" not in [tags] {
      json {
        source => "log_message_raw"
        target => "parsed"
      }
    }
    mutate {
      remove_field => ["message"]
    }
  }
}
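
To see what this pipeline does to the sample line, here is a rough sketch in Python rather than Logstash. It is only an approximation: the regex stands in for TIMESTAMP_ISO8601 with a simpler, stricter pattern, and the function name parse_line and the shortened SAMPLE line are made up for illustration. It also shows that the payload decodes to a list, not an object. As far as I can tell, the json filter will not merge a non-object value into the event root, which would explain why setting a target made the difference.

```python
import json
import re

# Approximation of the grok pattern. The timestamp part is a simplified
# stand-in for TIMESTAMP_ISO8601 (an assumption, not the real definition).
LINE_RE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<instrument>[^ ]*) ?"
    r"(?P<log_message_raw>.*)"
)


def parse_line(line):
    """Split a log line into fields, then decode the JSON payload."""
    match = LINE_RE.match(line)
    if match is None:
        # Mirrors tag_on_failure => ["no_match"]
        return {"tags": ["no_match"]}
    event = match.groupdict()
    # Mirrors json { target => "parsed" }: keep the decoded value under one key.
    event["parsed"] = json.loads(event["log_message_raw"])
    return event


# Shortened version of the log line from the question.
SAMPLE = (
    '2020-03-11 05:54:26,174 JMXINSTRUMENTS-Threading '
    '[{"timestamp":"1583906066","label":"Threading",'
    '"ObjectName":"java.lang:type\\u003dThreading",'
    '"attributes":[{"name":"ThreadCount","value":152},'
    '{"name":"PeakThreadCount","value":164}]}]'
)

event = parse_line(SAMPLE)
print(type(event["parsed"]).__name__)   # the top-level JSON value is an array
print(event["parsed"][0]["attributes"])
```

In Logstash terms, the attributes would then be reachable under [parsed][0][attributes].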

I'm trying your pattern in https://grokdebug.herokuapp.com/ (a widely used online grok debugger), and it does match everything after "JMXINSTRUMENTS-Threading", capturing it in one big field called log_message:

{
  "timestamp": [
    [
      "2020-03-11 05:54:26,174"
    ]
  ],
  "YEAR": [
    [
      "2020"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "11"
    ]
  ],
  "HOUR": [
    [
      "05",
      null
    ]
  ],
  "MINUTE": [
    [
      "54",
      null
    ]
  ],
  "SECOND": [
    [
      "26,174"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "instrument": [
    [
      "JMXINSTRUMENTS-Threading"
    ]
  ],
  "log_message": [
    [
      "[{"timestamp":"1583906066","label":"Threading","ObjectName":"java.lang:type\\u003dThreading","attributes":[{"name":"CurrentThreadUserTime","value":18600000000},{"name":"ThreadCount","value":152},{"name":"TotalStartedThreadCount","value":1138},{"name":"CurrentThreadCpuTime","value":20804323112},{"name":"PeakThreadCount","value":164},{"name":"DaemonThreadCount","value":136}]}]"
    ]
  ]
}

If you want to extract all the fields contained in log_message, use a json filter in the filter section of your Logstash pipeline, right below your grok filter:

For example:

  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message}" }
    tag_on_failure => ["no_match"]
  }
  if "no_match" not in [tags] {
    json {
      source => "log_message"
    }
  }

That way, your JSON will be parsed and split into key/value fields.
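
Note that in this log the metrics themselves sit inside an "attributes" list of name/value objects, so after parsing they are still list entries rather than one field per metric. Below is a small sketch, in Python rather than Logstash, of what flattening that list into plain key/value pairs looks like. The function name flatten_attributes and the shortened PAYLOAD are made up for illustration, and it assumes every attribute carries both "name" and "value".

```python
import json


def flatten_attributes(payload):
    """Map each attribute name to its value for every object in the payload."""
    flat = {}
    for entry in json.loads(payload):
        for attribute in entry.get("attributes", []):
            flat[attribute["name"]] = attribute["value"]
    return flat


# Shortened version of the JSON part of the log line.
PAYLOAD = (
    '[{"label":"Threading","attributes":['
    '{"name":"ThreadCount","value":152},'
    '{"name":"DaemonThreadCount","value":136}]}]'
)

metrics = flatten_attributes(PAYLOAD)
print(metrics)
```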

EDIT:

You could try using a kv filter instead of json. Here are the docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html

  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message}" }
    tag_on_failure => ["no_match"]
  }
  if "no_match" not in [tags] {
    kv {
      source => "log_message"
      value_split => ":"
      include_brackets => true # treat brackets around values as wrappers and strip them
      remove_char_key => "\""
      remove_char_value => "\""
      field_split => ","
    }
  }
