
ELK Stack and scaling

Original post: 2015-04-20 16:19:21 · Tags: elasticsearch, logstash, kibana

Bear with me here. I have spent the last week or so familiarising myself with the ELK Stack.

I have a working single-box solution running the ELK stack, and I have the basics down on how to forward more than one type of log and how to put them into different ES indexes.

This is all working pretty well, and I would like to expand operations.

My question is how to scale the solution out to cover more data needs and requirements.

The current solution is handling a small subset of data and working fine, but I would like to aggregate a lot more data. For example, I am currently pushing message tracking logs from 4 mailbox servers. I want to do the same for 40 mailbox servers, and much, much busier ones.

I would also like to push IIS log files from the Client Access servers. There are 18 CAS servers, and around 30 minutes of IIS logs per server during peak time came to 120MB, with almost 1 million records.
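To put those IIS figures in per-second terms, here is a rough back-of-the-envelope sketch. It assumes all 18 CAS servers peak at the same time and at the same rate as the sampled one, and it rounds "almost 1 million" up to 1,000,000, so treat the result as an upper-bound estimate. The mailbox servers are not included.

```python
# Rough peak-load estimate from the figures quoted in the question.
CAS_SERVERS = 18
WINDOW_SECONDS = 30 * 60          # the 30-minute peak sample
MB_PER_SERVER = 120               # IIS log size per server in that window
RECORDS_PER_SERVER = 1_000_000    # "almost 1 million", rounded up (assumption)


def peak_rates(servers, mb_per_server, records_per_server, window_seconds):
    """Return (events per second, MB per second) across all servers."""
    events_per_second = servers * records_per_server / window_seconds
    mb_per_second = servers * mb_per_server / window_seconds
    return events_per_second, mb_per_second


events, mb = peak_rates(CAS_SERVERS, MB_PER_SERVER, RECORDS_PER_SERVER, WINDOW_SECONDS)
print(f"{events:,.0f} events/s, {mb:.1f} MB/s")  # prints "10,000 events/s, 1.2 MB/s"
```

The raw byte rate is modest. The event rate is the number to size Logstash and Elasticsearch against.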

This volume of data would most likely collapse a single box running ELK.

I haven't really looked into it, but I read that ES allows for some form of clustering to add more instances. Does the same apply to Logstash as well? Should Kibana be run on more than one server, or on a different server from both Logstash and ES?

2 answers

You will hit limits with logstash if you're doing a lot of processing on the records: groks, conditionals, etc. Watch the CPU utilization of the machine for hints.

For elasticsearch itself, it's about RAM and disk IO. Having more nodes in a cluster should provide both.

With two elasticsearch nodes, you'll get redundancy (a copy on both machines). Add a third, and you can start to realize an IO benefit (writing two copies to three machines spreads the IO).
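The two-versus-three node point can be illustrated with a simplified model. This sketch assumes shard copies are spread evenly across nodes and ignores everything else (merges, queries, uneven shard sizes). The function name is made up for illustration.

```python
def write_share_per_node(nodes: int, copies: int) -> float:
    """Fraction of all indexed documents that each node has to write to disk.

    `copies` counts the primary plus its replicas. Elasticsearch does not
    place two copies of the same shard on one node, so copies <= nodes.
    """
    if copies < 1 or nodes < 1:
        raise ValueError("nodes and copies must be at least 1")
    if copies > nodes:
        raise ValueError("cannot place more copies than there are nodes")
    return copies / nodes


two_nodes = write_share_per_node(nodes=2, copies=2)
three_nodes = write_share_per_node(nodes=3, copies=2)
print(f"2 nodes: {two_nodes:.2f}, 3 nodes: {three_nodes:.2f}")  # prints "2 nodes: 1.00, 3 nodes: 0.67"
```

With two nodes and two copies, every node writes every document, so the second node adds redundancy but no write headroom. The third node is the first one that lowers the per-node write load.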

The ultimate data node will have 64GB of RAM on the machine, with 31GB allocated to elasticsearch.
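The 64GB/31GB figures match a commonly cited rule of thumb: give the Elasticsearch heap about half of the machine's RAM, leave the rest to the filesystem cache, and stay under roughly 32GB so the JVM can keep using compressed object pointers. The answer does not spell this reasoning out, so the sketch below is an assumption about it, and `suggested_heap_gb` is a hypothetical helper, not an Elasticsearch API.

```python
def suggested_heap_gb(ram_gb: int, ceiling_gb: int = 31) -> int:
    """Rule-of-thumb heap size: half of RAM, capped just below 32GB."""
    if ram_gb < 2:
        raise ValueError("ram_gb must be at least 2")
    return min(ram_gb // 2, ceiling_gb)


for ram in (16, 64, 128):
    print(f"{ram}GB RAM -> {suggested_heap_gb(ram)}GB heap")
```

Under this rule, RAM beyond 64GB on one data node no longer grows the heap. It only adds filesystem cache.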

You'll probably want to add non-data nodes, which handle the routing of data to be indexed and the 'reduce' phase when running queries. Put two of them behind a load balancer.

As Alain mentioned, adding more ES nodes will improve performance (and give you redundancy).

On the logstash front, we have two logstash servers feeding into ES. At the moment we just direct different servers to log to the different logstash servers, but we're likely to add an HAProxy layer in front to do this automatically, and again provide redundancy.
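As a rough illustration of what a load-balancing layer in front of the logstash servers would do, here is a minimal round-robin pool that skips backends marked as down. It is a sketch of the general idea only, not HAProxy's implementation or configuration. The class, the host names and the port are hypothetical.

```python
from itertools import cycle


class RoundRobinPool:
    """Hand out backends in turn, skipping any that are marked down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._cycle = cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        # Try each backend at most once per call before giving up.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical logstash endpoints, for illustration only.
pool = RoundRobinPool(["logstash-a:5000", "logstash-b:5000"])
normal = [pool.next_backend() for _ in range(4)]   # alternates between a and b

pool.mark_down("logstash-a:5000")
degraded = [pool.next_backend() for _ in range(2)]  # only b is returned
```

Compared with pointing each source server at a fixed logstash server, this keeps events flowing when one logstash server goes away, which is the redundancy the answer is after.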

With Kibana, I wouldn't worry too much. As far as I'm aware, most of the processing is done in the client browser, and whatever isn't depends more on the performance of the ES cluster.