
Logstash changing ordering of JSON input to Elasticsearch

I have a Logstash process running which consumes data from a Kafka topic. The messages in the Kafka topic are already in JSON format, and Logstash simply pushes them into Elasticsearch. But while doing so, Logstash changes the ordering of the fields. Another team consumes a CSV export of this data, so the changed ordering causes them trouble. What could be the reason?

For example, the input JSON is {"foo1":"bar1","foo2":"bar2"}. Logstash pushes it into Elasticsearch, where it then appears as {"foo2":"bar2","foo1":"bar1"}.
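Part of the answer is that a JSON object is defined as an unordered collection of name/value pairs (RFC 8259), so nothing in the pipeline is obliged to preserve field order. Any consumer that needs a stable order has to impose one itself; a minimal Ruby sketch of one deterministic choice, sorting keys before serializing:

```ruby
require 'json'

# A JSON object is unordered per the spec, so a consumer that needs
# a stable field order must normalize it. Sorting keys is one option.
doc = JSON.parse('{"foo2":"bar2","foo1":"bar1"}')

# Hash#sort yields [key, value] pairs ordered by key; to_h rebuilds a hash.
sorted_json = JSON.generate(doc.sort.to_h)
puts sorted_json  # {"foo1":"bar1","foo2":"bar2"}
```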

logstash config

input {
    kafka {
        codec => "json"
        bootstrap_servers => "localhost:9092"
        topics => ["sample-logs"]
        auto_offset_reset => "earliest"
        group_id => "logstash-consumer"
    }
}
output {
    elasticsearch {
        hosts => "localhost:9200"
        index => "sample-logs-es"
    }
    stdout {
        codec => rubydebug
    }
}

Two good reasons to keep the fields in the same order, or sorted:

  1. the _source fields compress better if you have a lot of similar data
  2. Easier for humans looking at the data in Kibana

I have a Logstash Ruby script that corrects for version updates in the code processing and some past mistakes. Sadly, I also get randomly ordered JSON out of it, and I have no idea yet how to get it sorted again for ingestion into Elastic. A crude approach would be dumping everything to a file, running it through jq, and then ingesting directly.
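The crude jq approach works because jq has a built-in flag for exactly this: -S sorts the keys of every object it emits. A sketch, assuming the dump is a file of one JSON document per line (dump.json is a hypothetical name):

```shell
# -S sorts object keys; -c emits compact one-document-per-line JSON,
# which suits bulk ingestion. dump.json is a hypothetical dump file.
echo '{"foo2":"bar2","foo1":"bar1"}' | jq -Sc '.'
# {"foo1":"bar1","foo2":"bar2"}

# Applied to a whole dump:
# jq -Sc '.' dump.json > sorted.json
```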

Set pipeline.workers to 1, otherwise multiple workers will run the filter + output stages in parallel.
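This addresses event ordering rather than field ordering within a document: with more than one worker, batches are filtered and output concurrently, so events can reach Elasticsearch out of sequence. A sketch of the setting in logstash.yml (it can also be passed per invocation):

```yaml
# logstash.yml — process events on a single worker thread so that
# output order matches input order. Equivalent CLI flag: -w 1
pipeline.workers: 1
```

Note that this trades away throughput, since all filtering and output then runs on one thread.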


 