
Logstash high availability deployment

I am using Logstash in a mode where it reads log files from disk and writes them to ElasticSearch.

What is the best way to deploy Logstash for high availability (especially failover)? I'm OK with either active/active mode, where two Logstash instances are always running, or active/passive mode, where one instance does the work and the other starts only if the first one goes down.

I'm specifically asking about logstash and not ElasticSearch.

It seems that Logstash has no built-in HA options, which leaves us with the classic Linux approach: a virtual IP. I have been thinking about the same topic and have decided to try the following setup (hot/cold version):

  • build 2 separate server instances with Logstash as indexer
  • find a way to keep the .conf files of the Logstash indexers in sync (rsync, git, etc.)
  • use a virtual IP and Linux heartbeat to move the active virtual IP between the servers, or use another load-balancing solution that can do the same job (for example, pfSense as a load balancer)
  • give each Logstash indexer instance its own Redis instance to buffer logs, so that the logs can potentially be moved out of the buffer if something goes wrong with Logstash.

Here are the issues that still need to be solved:

  • Redis cannot be run in active/active HA, which raises the problem of how log messages are routed, and how to find them, during or after an instance switch.
  • The same applies to active/passive Logstash: when a switch occurs, how do you recover the logs that were missed during that window?

As far as I know, active/active Logstash is possible only if you accept one of the following:

  • logs are duplicated, if you list both indexer nodes as outputs in the Logstash shippers.
  • or you have to provide some logic outside the Logstash indexer configuration that decides where each log is shipped, i.e. make sure the same log messages are not sent to both indexers.
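
One way to picture the second option is a small routing rule on the shipper side. The following is only a sketch in Python, not something Logstash provides: the indexer addresses, the `pick_indexer` function and the idea of keying on a source identifier are all assumptions made for illustration. It maps each log source to exactly one indexer, so both indexers stay active but no message is shipped twice.

```python
import hashlib

# Hypothetical indexer addresses, used only for illustration.
INDEXERS = ["indexer-a:5044", "indexer-b:5044"]


def pick_indexer(source_key, healthy=None):
    """Return exactly one indexer for a given log source.

    The same source_key maps to the same indexer as long as the set of
    healthy indexers does not change, so an event is never sent to both.
    If `healthy` is None, every indexer is assumed to be reachable.
    """
    candidates = [i for i in INDEXERS if healthy is None or i in healthy]
    if not candidates:
        raise RuntimeError("no healthy indexer available")
    # Use a stable hash (not the built-in hash(), which is salted per process).
    digest = hashlib.sha256(source_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]
```

When one indexer is reported down, passing the remaining one in `healthy` sends every source to it, which is the failover case. Logs that were in flight at the moment of the switch are still not covered by this rule.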

You could use a queue that acts as a buffer between the input and the indexing process.

It is always a good idea to separate tiers with a queue, so that if ElasticSearch crashes, your application does not suffer.
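
To make the buffering idea concrete, here is a minimal in-memory sketch in Python. It is not how Logstash or Redis implement it; the `BufferedForwarder` class and the `send` callable are hypothetical names, and a real deployment would use a persistent queue rather than process memory. The point is only that the input side keeps accepting events while the indexing side is down, and delivery resumes in order once it comes back.

```python
from collections import deque


class BufferedForwarder:
    """Bounded FIFO buffer between an input and a downstream sender."""

    def __init__(self, send, maxsize=10000):
        # `send` is a hypothetical callable that delivers one event
        # downstream and raises an exception if delivery fails.
        self._send = send
        self._maxsize = maxsize
        self._buffer = deque()

    def accept(self, event):
        """Called by the input side. Returns False when the buffer is full."""
        if len(self._buffer) >= self._maxsize:
            return False
        self._buffer.append(event)
        return True

    def drain(self):
        """Deliver buffered events in order and stop at the first failure.

        An event is removed from the buffer only after `send` succeeded,
        so a downstream crash does not lose it.
        """
        delivered = 0
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except Exception:
                break
            self._buffer.popleft()
            delivered += 1
        return delivered

    def pending(self):
        """Number of events still waiting for delivery."""
        return len(self._buffer)
```

Calling `drain()` periodically is enough for the sketch. If `send` fails after the downstream side has already stored the event, that event will be sent again on the next drain, so delivery here is at-least-once.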

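In this case the best approach is to use some kind of hardware load balancer, such as an F5 pool if you have one: define a VIP with the appropriate port, then associate that VIP with the IP addresses of your N Logstash hosts. That way you can run as many nodes as you like of Logstash, or of any other service you need, and apply a round-robin algorithm to balance the connections.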
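
The round-robin selection that such a balancer performs can be sketched in a few lines of Python. This is an illustration of the algorithm only, not F5 configuration; the `RoundRobinPool` class and the host names in the usage note are assumptions.

```python
import itertools


class RoundRobinPool:
    """Hand out hosts in rotation, skipping hosts that are marked down."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = list(hosts)
        self._cycle = itertools.cycle(self._hosts)
        self._down = set()

    def mark_down(self, host):
        """Exclude a host, for example after a failed health check."""
        self._down.add(host)

    def mark_up(self, host):
        """Put a host back into rotation."""
        self._down.discard(host)

    def next_host(self):
        """Return the next healthy host in rotation."""
        # Look at each host at most once per call.
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            if host not in self._down:
                return host
        raise RuntimeError("all hosts are marked down")
```

For example, with a pool built from the hypothetical hosts `["ls1", "ls2", "ls3"]`, successive calls to `next_host()` return `ls1`, `ls2`, `ls3`, `ls1`, and after `mark_down("ls2")` the rotation continues over the remaining two hosts.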
