How can I stop Logstash from parsing some fields?
I have a log file that looks like this (simplified):
{ "startDate": "2015-05-27", "endDate": "2015-05-27",
"request" : {"requestId":"123","field2":1,"field2": 2,"field3":3, ....} }
Logstash tries to parse all fields, including the "request" field. Is it possible not to parse this field?
I want to see the "request" field in Elasticsearch, but it shouldn't be parsed.
Here is part of my config file:
input {
file {
type => "json"
path => [
"/var/log/service/restapi.log"
]
tags => ["restapi"]
}
}
filter {
ruby {
init => "require 'socket'"
code => "
event['host'] = Socket.gethostname.gsub(/\..*/, '')
event['request'] = (event['request'].to_s);
"
}
if "restapi" in [tags] {
json {
source => "message"
}
date {
match => [ "date_start", "yyyy-MM-dd HH:mm:ss" ]
target => "date_start"
}
date {
match => [ "date_end", "yyyy-MM-dd HH:mm:ss" ]
target => "date_end"
}
date {
match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
target => "date"
}
}
}
output {
if "restapi" in [tags] {
elasticsearch {
hosts => ["......."]
template_name => "logs"
template => "/etc/logstash/templates/service.json"
template_overwrite => true
index => "service-logs-%{+YYYY.MM.dd}"
idle_flush_time => 20
flush_size => 500
}
}
}
Here is my template file:
{
"template" : "service-*",
"settings" : {
"index": {
"refresh_interval": "60s",
"number_of_shards": 6,
"number_of_replicas": 2
}
},
"mappings" : {
"logs" : {
"properties" : {
"@timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
"@version" : { "type" : "integer", "index" : "not_analyzed" },
"message": { "type" : "string", "norms" : { "enabled" : false } },
"method" : { "type" : "string", "index" : "not_analyzed" },
"traffic_source" : { "type" : "string", "index" : "not_analyzed" },
"request_path" : { "type" : "string", "index" : "not_analyzed" },
"status" : { "type" : "integer", "index" : "not_analyzed" },
"host_name" : { "type" : "string", "index" : "not_analyzed" },
"environment" : { "type" : "string", "index" : "not_analyzed" },
"action" : { "type" : "string", "index" : "not_analyzed" },
"request_id" : { "type" : "string", "index" : "not_analyzed" },
"date" : { "type" : "date", "format" : "dateOptionalTime" },
"date_start" : { "type" : "date", "format" : "dateOptionalTime" },
"date_end" : { "type" : "date", "format" : "dateOptionalTime" },
"adnest_type" : { "type" : "string", "index" : "not_analyzed" },
"request" : { "type" : "string", "index" : "not_analyzed" }
}
}
}
}
Here is an excerpt from logstash.log:
response=>{"create"=>{"_index"=>"logs-2017.02.08", "_type"=>"json", "_id"=>"AVoeNgdhD5iEO87EVF_n", "status" =>400, "error"=> "type"=>"mapper_parsing_exception", "reason"=>"failed to parse [request]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"unknown property [requestId]" }}}}, :level=>:warn}
You should be able to do this with a ruby filter:
filter {
ruby {
init => "require 'socket'"
code => "
event['host'] = Socket.gethostname.gsub(/\..*/, '')
event['request'] = (event['request'].to_s);
"
}
if "restapi" in [tags] {
ruby {
code => '
require "json"
event.set("request",event.get("request").to_json)'
}
date {
match => [ "date_start", "yyyy-MM-dd HH:mm:ss" ]
target => "date_start"
}
date {
match => [ "date_end", "yyyy-MM-dd HH:mm:ss" ]
target => "date_end"
}
date {
match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
target => "date"
}
}
}
When testing this with stubbed-out stdin/stdout:
input {
stdin { codec => json }
}
# above filter{} block here
output {
stdout { codec=>rubydebug}
}
And testing like this:
echo '{ "startDate": "2015-05-27", "endDate": "2015-05-27", "request" : {"requestId":"123","field2":1,"field2": 2,"field3":3} }' | bin/logstash -f test.conf
It outputs this:
{
"startDate" => "2015-05-27",
"endDate" => "2015-05-27",
"request" => "{\"requestId\"=>\"123\", \"field2\"=>2, \"field3\"=>3}",
"@version" => "1",
"@timestamp" => "2017-02-09T14:37:02.789Z",
"host" => "xxxx"
}
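The conversion step can also be tried outside Logstash. The following is only a rough sketch in plain Ruby, assuming the event behaves like an ordinary Hash (the `event` variable here is a stand-in, not the real Logstash event object) and using the sample line with the duplicate "field2" key collapsed to a single entry. It shows that `to_json` turns the nested object into a single string that is still valid JSON:

```ruby
require "json"

# Sample log line from the question, with the duplicate "field2" key
# collapsed so that the parser sees each key once.
line = '{ "startDate": "2015-05-27", "endDate": "2015-05-27", ' \
       '"request" : {"requestId":"123","field2":2,"field3":3} }'

# Stand-in for the Logstash event: a plain Hash (an assumption made for
# illustration; the real event is accessed through event.get / event.set).
event = JSON.parse(line)

# Replace the nested object with its JSON text, so that downstream
# consumers receive one string value instead of an object with sub-fields.
event["request"] = event["request"].to_json

puts event["request"].class
puts event["request"]
```

Because the value is now a string, it can be decoded again later with `JSON.parse` if the individual request fields are ever needed.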
So that answers your original question. If you can't figure out why your template isn't working, you should ask a separate question.
Elasticsearch analyzes string fields by default. If all you need is for the request field not to be analyzed, change how it is indexed by setting "index": "not_analyzed" in the field's mapping.
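As a small sketch, the mapping fragment for that field could be built and printed from Ruby as shown here. The value is spelled `not_analyzed`, with an underscore, and this assumes an Elasticsearch version that still uses the `string` type (as the template in the question does). The variable name `request_mapping` is purely illustrative:

```ruby
require "json"

# Mapping entry for the "request" field: stored as one unanalyzed string.
# Assumes a pre-5.x Elasticsearch, where "string" / "not_analyzed" apply.
request_mapping = {
  "request" => { "type" => "string", "index" => "not_analyzed" }
}

# Print the fragment as it would appear under "properties" in a template.
puts JSON.pretty_generate("properties" => request_mapping)
```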