
ElasticSearch/Logstash data gaps

I am using a minimal Logstash setup and the Syslog input to collect execution data from three remote systems.
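For reference, the configuration is essentially the following (a trimmed-down sketch, not my exact file; the port, host and option names are placeholders and may differ slightly between Logstash versions):

    input {
      syslog {
        port => 5514            # the three remote systems forward their syslog streams here
        type => "syslog"
      }
    }

    output {
      elasticsearch {
        host => "localhost"     # the Elasticsearch node the events are indexed into
      }
    }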

This works fine, but sometimes there are data gaps: some log entries that are present in the original log files never make it to ElasticSearch.

My question is whether Logstash drops data when the load increases.

If yes, I would like to know:

  1. How to confirm that in my case. Is there a way to monitor Logstash for such data losses? Will Logstash throw an error or similar? (See the rough sketch after this list for one way I could imagine checking it.)
  2. Recommendations for avoiding such data gaps. I already decreased the frequency of produced log events. I assume the next step would be to take one of the scaling approaches proposed here: deploying-and-scaling-logstash
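One rough self-check I can imagine (a sketch only, assuming the metrics filter plugin is available for my Logstash version; the field names follow that plugin's conventions) is to let Logstash count the events it actually processes and compare that number with the line count of the source log files:

    filter {
      metrics {
        meter   => "events"      # keeps a running count plus 1/5/15-minute rates
        add_tag => "metric"
      }
    }

    output {
      # Print only the synthetic metric events; normal events keep their usual output path.
      if "metric" in [tags] {
        stdout {
          codec => line { format => "events processed so far: %{[events][count]}" }
        }
      }
    }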

Thanks, Michail

The current version of logstash (1.x) has a very small pipeline queue and stops accepting new messages once that queue is full. For a file{} input this isn't a problem, because the file simply sits on disk waiting for logstash to resume. For syslog, which has no such buffer, messages are lost.

The current recommendation is to put a broker (redis, rabbitmq) in between; the broker's queue can keep growing while logstash is congested.
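For example, a typical shipper/indexer split with redis as the broker looks roughly like this (a sketch; hosts, ports and the key name are placeholders, and the same pattern works with rabbitmq and its input/output plugins):

    # Shipper instance: accept syslog and push events into a redis list (the buffer)
    input {
      syslog { port => 5514 }
    }
    output {
      redis {
        host      => "127.0.0.1"
        data_type => "list"
        key       => "logstash"
      }
    }

    # Indexer instance (separate config/process): drain the redis list at its own pace
    input {
      redis {
        host      => "127.0.0.1"
        data_type => "list"
        key       => "logstash"
      }
    }
    output {
      elasticsearch { host => "localhost" }
    }

Because the redis list can keep growing (up to available memory), the shipper keeps accepting syslog traffic even while the indexing side is slow, which is what closes the kind of gap you are seeing.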

logstash 2.0 is said to have a real pipeline cache, so an extra broker won't be required.
