How not to parse some fields with Logstash?
I have a log file that looks like this (simplified):
{ "startDate": "2015-05-27", "endDate": "2015-05-27",
"request" : {"requestId":"123","field2":1,"field2": 2,"field3":3, ....} }
Logstash tries to parse all fields, including the "request" field. Is it possible to skip parsing that one field? I want to see the "request" field in Elasticsearch, but it should not be parsed. Here is the relevant part of my config file:
input {
  file {
    type => "json"
    path => [
      "/var/log/service/restapi.log"
    ]
    tags => ["restapi"]
  }
}
filter {
  ruby {
    init => "require 'socket'"
    code => "
      event['host'] = Socket.gethostname.gsub(/\..*/, '')
      event['request'] = (event['request'].to_s);
    "
  }
  if "restapi" in [tags] {
    json {
      source => "message"
    }
    date {
      match => [ "date_start", "yyyy-MM-dd HH:mm:ss" ]
      target => "date_start"
    }
    date {
      match => [ "date_end", "yyyy-MM-dd HH:mm:ss" ]
      target => "date_end"
    }
    date {
      match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
      target => "date"
    }
  }
}
output {
  if "restapi" in [tags] {
    elasticsearch {
      hosts => ["......."]
      template_name => "logs"
      template => "/etc/logstash/templates/service.json"
      template_overwrite => true
      index => "service-logs-%{+YYYY.MM.dd}"
      idle_flush_time => 20
      flush_size => 500
    }
  }
}
Here is my template file:
{
  "template" : "service-*",
  "settings" : {
    "index": {
      "refresh_interval": "60s",
      "number_of_shards": 6,
      "number_of_replicas": 2
    }
  },
  "mappings" : {
    "logs" : {
      "properties" : {
        "@timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
        "@version" : { "type" : "integer", "index" : "not_analyzed" },
        "message": { "type" : "string", "norms" : { "enabled" : false } },
        "method" : { "type" : "string", "index" : "not_analyzed" },
        "traffic_source" : { "type" : "string", "index" : "not_analyzed" },
        "request_path" : { "type" : "string", "index" : "not_analyzed" },
        "status" : { "type" : "integer", "index" : "not_analyzed" },
        "host_name" : { "type" : "string", "index" : "not_analyzed" },
        "environment" : { "type" : "string", "index" : "not_analyzed" },
        "action" : { "type" : "string", "index" : "not_analyzed" },
        "request_id" : { "type" : "string", "index" : "not_analyzed" },
        "date" : { "type" : "date", "format" : "dateOptionalTime" },
        "date_start" : { "type" : "date", "format" : "dateOptionalTime" },
        "date_end" : { "type" : "date", "format" : "dateOptionalTime" },
        "adnest_type" : { "type" : "string", "index" : "not_analyzed" },
        "request" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}
And here is the error from logstash.log:
response=>{"create"=>{"_index"=>"logs-2017.02.08", "_type"=>"json", "_id"=>"AVoeNgdhD5iEO87EVF_n", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [request]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"unknown property [requestId]"}}}}, :level=>:warn}
You should be able to do this with a ruby filter:
filter {
  ruby {
    init => "require 'socket'"
    code => "
      event['host'] = Socket.gethostname.gsub(/\..*/, '')
      event['request'] = (event['request'].to_s);
    "
  }
  if "restapi" in [tags] {
    ruby {
      code => '
        require "json"
        event.set("request", event.get("request").to_json)'
    }
    date {
      match => [ "date_start", "yyyy-MM-dd HH:mm:ss" ]
      target => "date_start"
    }
    date {
      match => [ "date_end", "yyyy-MM-dd HH:mm:ss" ]
      target => "date_end"
    }
    date {
      match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
      target => "date"
    }
  }
}
When testing this with stubbed stdin/stdout:
input {
  stdin { codec => json }
}
# above filter {} block goes here
output {
  stdout { codec => rubydebug }
}
and feeding it input like this:
echo '{ "startDate": "2015-05-27", "endDate": "2015-05-27", "request" : {"requestId":"123","field2":1,"field2": 2,"field3":3} }' | bin/logstash -f test.conf
it outputs this:
{
     "startDate" => "2015-05-27",
       "endDate" => "2015-05-27",
       "request" => "{\"requestId\"=>\"123\", \"field2\"=>2, \"field3\"=>3}",
      "@version" => "1",
    "@timestamp" => "2017-02-09T14:37:02.789Z",
          "host" => "xxxx"
}
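Note the escaped value of "request" above: it comes from Hash#to_s, which produces Ruby inspect syntax ("=>" separators) rather than JSON, while the to_json call in the second ruby block produces a string that can actually be parsed back. A minimal sketch of the difference in plain Ruby, outside Logstash, using values taken from the sample log:

```ruby
require 'json'

# A hash like the parsed "request" field after JSON decoding.
request = { "requestId" => "123", "field2" => 2, "field3" => 3 }

# Hash#to_s returns Ruby inspect syntax ("=>" separators), which is not JSON.
puts request.to_s

# Hash#to_json returns a valid JSON string that round-trips cleanly.
puts request.to_json  # {"requestId":"123","field2":2,"field3":3}
```

If you want the stored string to be valid JSON (e.g. to re-parse it later), make sure the to_json conversion runs instead of the plain to_s in the first ruby block.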
So that answers your original question. If you can't work out why your template isn't being applied, that should be a separate question.
Elasticsearch analyzes string fields by default. If all you need is for the request field not to be analyzed, change how it is indexed by setting "index": "not_analyzed" in that field's mapping.
More information in the documentation.