Elasticsearch cluster design for ~200G logs a day

I've created an ES cluster (version 5.4.1) with 4 data nodes, 3 master nodes, and one client node (Kibana).

The data nodes are r4.2xlarge AWS instances (61 GB memory, 8 vCPUs) with 30 GB of memory allocated to the Elasticsearch JVM.

We're writing around 200 GB of logs every day and keeping them for 14 days.
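For a rough sense of scale, the retention policy pins down the storage requirement. A back-of-the-envelope calculation, assuming one replica per shard (the Elasticsearch default; the question doesn't state the replica count):

    # Rough storage estimate for 200 GB/day retained for 14 days.
    daily_gb = 200
    retention_days = 14
    replicas = 1          # assumed; Elasticsearch default
    data_nodes = 4

    primary_gb = daily_gb * retention_days        # 2800 GB of primary data
    total_gb = primary_gb * (1 + replicas)        # 5600 GB including replica copies
    per_node_gb = total_gb / data_nodes           # ~1400 GB per data node

    print(f"primary: {primary_gb} GB, total: {total_gb} GB, "
          f"per node: {per_node_gb:.0f} GB")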

I'm looking for recommendations to improve the cluster's performance, especially search performance (Kibana).

More data nodes? More client nodes? Bigger nodes? More replicas? Anything that can improve performance is an option.

Has anyone built something close to this design or run similar loads? I'd be happy to hear about other designs and loads.

Thanks, Moshe

  1. How many shards are you using? The default of 5? That would actually be a pretty good number. Depending on who you ask, a shard should be between 10 GB and 50 GB; with a logging use case, probably closer to the 50 GB end (see the worked example after this list).
  2. Which queries do you want to speed up? Do they mainly target recent data or long time spans? If you are mainly interested in recent data, you could use different node types in a hot-warm architecture: more powerful nodes hold the recent, frequently queried data, while the bulk of older, less frequently accessed data sits on less powerful nodes (see the allocation sketch after this list).
  3. Generally you'll need to find your bottleneck. I'd install the free monitoring plugin and take a look at how both Kibana and Elasticsearch are doing (the stats sketch after this list is a starting point).
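On point 1, the arithmetic works out neatly for this cluster. A quick worked example, assuming one index per day with the 5.x default of 5 primary shards and 1 replica (the question states neither):

    # Shard-size check for daily indices.
    daily_gb = 200
    primary_shards = 5    # Elasticsearch 5.x default, assumed here
    replicas = 1          # assumed default
    retention_days = 14

    shard_gb = daily_gb / primary_shards                             # 40 GB per primary shard
    open_shards = primary_shards * (1 + replicas) * retention_days   # 140 shards held

    print(f"{shard_gb:.0f} GB per shard; {open_shards} shards on the cluster")
    # 40 GB per shard sits inside the suggested 10-50 GB band,
    # so the default shard count is reasonable for this volume.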
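On point 2, hot-warm works by tagging each node with a custom attribute (for example node.attr.box_type: hot or node.attr.box_type: warm in elasticsearch.yml, the usual convention on 5.x) and steering indices with an allocation filter. A minimal sketch over the REST API using Python's requests; the host and index name are assumptions, not from the question:

    import requests

    ES = "http://localhost:9200"   # assumed client-node address

    # Index templates would pin new indices to box_type=hot; once an
    # index ages out of the heavy-query window, flip its allocation
    # filter and Elasticsearch relocates it to the warm nodes.
    def move_to_warm(index: str) -> None:
        resp = requests.put(
            f"{ES}/{index}/_settings",
            json={"index.routing.allocation.require.box_type": "warm"},
        )
        resp.raise_for_status()

    # e.g. run daily against the index that just aged out (name hypothetical):
    move_to_warm("logstash-2017.06.18")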
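On point 3, even before a monitoring plugin is in place, the stats APIs give a first read on where the cluster is struggling. A rough sketch under the same assumed host; sustained heap above ~75% or pegged CPU points at the data nodes rather than at disk:

    import requests

    ES = "http://localhost:9200"   # assumed client-node address

    health = requests.get(f"{ES}/_cluster/health").json()
    print(health["status"], "-", health["active_shards"], "active shards")

    # Per-node JVM heap usage from the node stats API.
    stats = requests.get(f"{ES}/_nodes/stats/jvm,os").json()
    for node in stats["nodes"].values():
        print(node["name"], f'heap {node["jvm"]["mem"]["heap_used_percent"]}%')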

Wild guess: you are limited by IO. Prefer local disks over EBS, prefer SSDs over spinning disks, and if you can, get as many IOPS as you can afford for this use case. The sampling sketch below shows one way to measure it.
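To test that guess, the fs section of the node stats reports cumulative disk operation counters on Linux; sampling it twice and diffing gives the live IOPS per node. A sketch under the same assumptions as above:

    import time
    import requests

    ES = "http://localhost:9200"   # assumed client-node address

    def io_totals():
        # io_stats is only populated on Linux hosts.
        stats = requests.get(f"{ES}/_nodes/stats/fs").json()
        return {n["name"]: n["fs"].get("io_stats", {}).get("total", {})
                for n in stats["nodes"].values()}

    before = io_totals()
    time.sleep(60)
    after = io_totals()

    for name, t in after.items():
        reads = t.get("read_operations", 0) - before[name].get("read_operations", 0)
        writes = t.get("write_operations", 0) - before[name].get("write_operations", 0)
        print(f"{name}: {reads / 60:.0f} read ops/s, {writes / 60:.0f} write ops/s")

Comparing those numbers against the EBS volumes' provisioned IOPS shows how close to the ceiling the data nodes are running.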
