
Logstash / Elasticsearch / Kibana resource planning

How should I plan resources (Elasticsearch instances, I suspect) according to load?

By load I mean ≈500K events/min, each event containing 8-10 fields.

What are the configuration knobs I should turn? I'm new to this stack.

500,000 events per minute is roughly 8,333 events per second, which a small cluster (3-5 machines) should handle easily.

The problem will come with keeping 60 days of 720M daily documents searchable (43.2B documents total). If each of the 10 fields is 32 bytes, that's 13.8TB of disk space (nearly 28TB with a single replica).
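The figures above come from straightforward arithmetic; a quick sketch of the same math, using the question's assumptions (10 fields per event, ~32 bytes per field, 60-day retention):

```python
# Back-of-the-envelope capacity math for the numbers quoted above.
# Assumptions from the question: 500K events/min, 10 fields/doc,
# ~32 bytes/field, 60 days of retention, one replica.
EVENTS_PER_MIN = 500_000
FIELDS_PER_DOC = 10
BYTES_PER_FIELD = 32
RETENTION_DAYS = 60

events_per_sec = EVENTS_PER_MIN / 60             # ~8,333 events/s
docs_per_day = EVENTS_PER_MIN * 60 * 24          # 720M documents/day
total_docs = docs_per_day * RETENTION_DAYS       # 43.2B documents
primary_tb = total_docs * FIELDS_PER_DOC * BYTES_PER_FIELD / 1e12  # ~13.8 TB
with_replica_tb = primary_tb * 2                 # ~27.6 TB

print(f"{events_per_sec:,.0f} events/s, {total_docs / 1e9:.1f}B docs, "
      f"{primary_tb:.1f} TB primary, {with_replica_tb:.1f} TB with 1 replica")
```

Note this counts raw field bytes only; actual index size also depends on analysis, `_source` storage, and compression, so treat it as a lower bound.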

For comparison, my cluster has at most 5 nodes (64GB of RAM each, 31GB heap), with 1.2B documents consuming 1.2TB of disk space (double that with a replica). This cluster could not handle the load with only 32GB of RAM per machine, but it's happy now with 64GB. That's 10 days of data for us.

Roughly, you're expecting to have 40x the number of documents and 10x the disk usage of my cluster.

I don't have the exact numbers in front of me, but our pilot project using doc_values is giving us something like a 90% heap savings.
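For context, on the pre-2.x clusters this answer describes, doc_values had to be enabled per field in the mapping (from 2.0 onward it is the default for non-analyzed fields). A minimal sketch of such a mapping; the index type `logs` and the field names `status` and `bytes` are made up for illustration:

```python
import json

# Illustrative 1.x-era mapping that enables doc_values per field.
# Field names ("status", "bytes") are hypothetical; doc_values moves
# sorting/aggregation data from heap-resident fielddata onto disk.
mapping = {
    "mappings": {
        "logs": {
            "properties": {
                "status": {
                    "type": "string",
                    "index": "not_analyzed",
                    "doc_values": True,
                },
                "bytes": {"type": "long", "doc_values": True},
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```

This JSON body would be sent when creating the index; the heap savings come from aggregations reading these on-disk columnar values instead of loading fielddata into the heap.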

If all of that math holds, and doc_values is that good, you could be OK with a similar-sized cluster as far as actual bytes indexed are concerned. I would solicit additional input on the overhead of keeping so many individual documents.

We've done some amount of Elasticsearch tuning, but there's probably more that could be done as well.

I would advise you to start with a handful of 64GB machines and add more as needed. Toss in a couple of (smaller) client nodes as the front end for index and search requests.
