logstash: grok parse failure

I have this config file:

input {
    stdin {}
    file {
        type => "txt"
        path => "C:\Users\Gck\Desktop\logsatash_practice\input.txt"
        start_position => "beginning"
    }
}


filter {
    grok {
        match => [ "message", "%{DATE:timestamp} %{IP:client} %{WORD:method} %{WORD:text}" ]
    }
    date {
        match => [ "timestamp", "MMM-dd-YYYY-HH:mm:ss" ]
        locale => "en"
    }
}

output {
    file {
        path => "C:\Users\Gck\Desktop\logsatash_practice\op\output3.txt"
    }
}

And let's say this is my input:

MAY-08-2015-08:00:00 55.3.244.1 GET hello

MAY-13-2015-13:00:00 56.4.245.2 GET world

After running it, I get a grok parse failure message.

This is the output:

{"message":"MAY-08-2015-08:00:00\\t55.3.244.1\\thello\\r","@version":"1","@timestamp":"2015-05-11T12:51:05.268Z","type":"txt","host":"user-PC","path":"C:\\Users\\Gck\\Desktop\\logsatash_practice\\input.txt","tags":["_grokparsefailure"]} { “消息”: “MAY-08-2015-08:00:00 \\ t55.3.244.1 \\ thello \\ r”, “@版本”: “1”, “@时间戳”:“2015-05-11T12: 51:05.268Z”, “类型”: “TXT”, “宿主”: “用户-PC”, “路径”: “C:\\用户\\ GCK \\桌面\\ logsatash_practice \\ input.txt中”, “标签”:[ “_grokparsefailure”]}

{"message":"MAY-13-2015-13:00:00\\t56.4.245.2\\tworld\\r","@version":"1","@timestamp":"2015-05-11T12:51:05.269Z","type":"txt","host":"user-PC","path":"C:\\Users\\Gck\\Desktop\\logsatash_practice\\input.txt","tags":["_grokparsefailure"]} { “消息”: “MAY-13-2015-13:00:00 \\ t56.4.245.2 \\ t世界\\ r”, “@版本”: “1”, “@时间戳”:“2015-05-11T12: 51:05.269Z”, “类型”: “TXT”, “宿主”: “用户-PC”, “路径”: “C:\\用户\\ GCK \\桌面\\ logsatash_practice \\ input.txt中”, “标签”:[ “_grokparsefailure”]}

What am I doing wrong?

No less important: is there any guide that sums up this filtering thing in a good, clear way? The Elastic guides aren't detailed enough.

The DATE grok pattern is defined like this:

DATE %{DATE_US}|%{DATE_EU}

DATE_US and DATE_EU are in turn defined like this:

DATE_US %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
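
To make the mismatch concrete, here are a few illustrative values (my own examples, not taken from the post) showing what those stock patterns do and do not accept:

# %{DATE_US} matches strings like:  05-08-2015  or  05/08/2015
# %{DATE_EU} matches strings like:  08.05.2015  or  08-05-2015
# Neither matches:                  MAY-08-2015-08:00:00   (alphabetic month name, trailing time)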

I could continue, but it's already clear that this doesn't match the actual content of your log message sample:

MAY-08-2015-08:00:00 55.3.244.1 GET hello

There's no stock grok pattern that matches this date format, but it's easy to put together a custom one. Also, note that the separators between the tokens in your log messages aren't spaces but tabs. We can use \s to match any whitespace character. Working example:

(?<timestamp>%{WORD}-%{MONTHDAY}-%{YEAR}-%{TIME})\s%{IP:client}\s%{WORD:method}\s%{WORD:text}

No less important: is there any guide that sums up this filtering thing in a good, clear way? The Elastic guides aren't detailed enough.

With the exception of the grok-specific %{PATTERN_NAME:variable} notation, this is all just plain regular expressions, and there are many introductory guides for those elsewhere.
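
As a rough illustration of that notation (using a simplified IPv4-only regex of my own, not the full stock IP pattern), these two grok expressions capture the same kind of field named client:

%{IP:client}
(?<client>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})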
