
Logstash Kafka input: worker ordering in data consumption

I am using Logstash to sync data from Kafka to Elasticsearch. The input topic has 8 partitions, and I have set consumer_threads => 8 to consume the Kafka topic in parallel.

input {
  kafka {
    bootstrap_servers => "bootstrapServer"
    topics => "topicName"
    codec => json
    group_id => "groupName"
    id => ""
    consumer_threads => 8
  }
}

After the input section, the pipeline has a filter section and an output section.

How can I increase Logstash worker parallelism without affecting the ordering of data within a Kafka partition?

Does Logstash use an in-memory queue between the input and the filter and output stages? How can I ensure that data from a single partition is processed by a single filter and output thread in Logstash?

You cannot have multiple worker threads process events in parallel and also preserve the order of those events. Even with a single worker thread, Logstash does not necessarily preserve event order by default. To get ordered processing, set pipeline.workers to 1 and set pipeline.ordered to true (the setting accepts auto, true, or false, not a number).
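As a minimal sketch of the settings described above, the following logstash.yml fragment pins the pipeline to one worker and enforces ordering. It assumes a Logstash release that supports pipeline.ordered (7.7 or later, as far as I know) and that the pipeline is configured through logstash.yml rather than pipelines.yml.

```yaml
# logstash.yml (sketch; assumes a Logstash version that supports pipeline.ordered)

# One worker thread runs the filter and output stages, so events leave the
# pipeline in the order they were taken from the queue.
pipeline.workers: 1

# Enforce ordering explicitly. With "true", Logstash refuses to start the
# pipeline if more than one worker is configured.
pipeline.ordered: true
```

The same two keys can be set per pipeline in pipelines.yml. The cost is throughput: all filter and output work runs on a single thread, regardless of how many consumer_threads the Kafka input uses.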
