
Elasticsearch slow search query performance

Original post: 2015-11-05 00:41:34. Tags: java, database, elasticsearch, lucene, elasticsearch-query

I'm having a lot of trouble tuning Elasticsearch for high search query performance. These are my specs:

ES Setup: Version 0.90.0, 2 nodes (m3.2xlarge AWS instances) in the cluster, 32GB RAM each, 50% allocated to ES_HEAP_SIZE, no swapping

Data: 75 million documents, 25 fields each

Queries made for benchmark: multi_match query against 5 text fields
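For readers who want to reproduce the benchmark, the request body for such a query might look roughly like the sketch below. The post does not give the field names or the search text, so `FIELDS`, the query string, and the helper name `multi_match_body` are all hypothetical placeholders.

```python
import json


def multi_match_body(text, fields, size=10):
    """Build a search request body for a multi_match query over several fields."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
            }
        },
    }


# Hypothetical field names: the post only says there are five text fields.
FIELDS = ["title", "description", "body", "tags", "author"]

body = multi_match_body("slow search", FIELDS)
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent to the index's `_search` endpoint. Every field listed adds term lookups and scoring work per query, which is consistent with scorer classes showing up in the hot threads.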

I've tried everything mentioned here and here

Up to an input rate of 30 requests/sec, the response time stays under 1s. Above 30 requests/sec, performance plummets and the response time increases to 50s. While this happens, the JVM heap is stable (around 7-8 in bigdesk) and GC is also stable. However, CPU usage climbs rapidly to 800% (8 cores) and the load average is very high, at 16. The hot threads keep switching between search and scoring functions such as BooleanScorer2.nextDoc, BooleanQuery.createWeight, DisjunctionSumScorer.advance, BufferedIndexInput.refill and the like.
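One rough way to read these numbers, offered as a sketch rather than a diagnosis: if the cluster can complete about 30 requests/sec once the CPUs are saturated, any sustained arrival rate above that makes the backlog grow without bound, so latency jumps from sub-second to tens of seconds instead of degrading gradually. The capacity of 30 req/s is taken from the post; the 35 req/s arrival rate, the 300s test duration, and the helper name `backlog_wait` are made-up values for illustration.

```python
def backlog_wait(arrival_rate, capacity, duration_s):
    """Approximate queueing delay (seconds) after duration_s of sustained load.

    Simple fluid model: requests arriving faster than the cluster can
    serve them pile up, and a new request waits for the backlog to drain.
    """
    excess = max(0.0, arrival_rate - capacity)  # requests/sec that cannot be served
    backlog = excess * duration_s               # requests queued so far
    return backlog / capacity                   # seconds needed to drain that backlog


# Below capacity there is no backlog in this model.
print(backlog_wait(25, 30, 300))  # 0.0
# Slightly above capacity, the wait keeps growing with the test duration.
print(backlog_wait(35, 30, 300))  # 50.0
```

In this model the delay depends on how long the overload lasts, so a longer benchmark run above the saturation point would report even worse response times.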

Question: Could you help me find out why performance plummets above 30 req/sec, and how to resolve this by changing the cluster configuration, if possible?

Thanks in advance!

1 answer

I know you're seeing CPU-bound behavior, but are you seeing any I/O spikes around the time the performance issues start?

If you're storing your index on EBS volumes, I wouldn't be surprised if you started seeing I/O saturation with a test like yours. M3 instances have fast local (ephemeral) SSD volumes, and if you're tuning for responsiveness, you should make sure your index is stored locally.

https://www.elastic.co/blog/performance-considerations-elasticsearch-indexing

I realize this doesn't speak directly to the CPU issue, but anything you can do to make a single query more responsive (including filtering, etc.) will boost your throughput.
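As a sketch of what "filtering" could mean here: in 0.90-era releases a full-text query can be wrapped in a `filtered` query, so that a cheap, cacheable, non-scoring filter narrows the candidate documents before the multi_match scoring runs. The field names, the filter field `category`, its value, and the helper name `filtered_multi_match` are hypothetical; the original post does not describe the mapping.

```python
import json


def filtered_multi_match(text, fields, filter_field, filter_value):
    """Wrap a multi_match query in a `filtered` query with a term filter.

    The filter does not contribute to scoring; it only restricts which
    documents the multi_match query has to score.
    """
    return {
        "query": {
            "filtered": {
                "query": {
                    "multi_match": {"query": text, "fields": list(fields)}
                },
                "filter": {"term": {filter_field: filter_value}},
            }
        }
    }


# Hypothetical names and values, for illustration only.
request = filtered_multi_match(
    "slow search",
    ["title", "description", "body", "tags", "author"],
    "category",
    "books",
)
print(json.dumps(request, indent=2))
```

Whether this helps depends on having a field selective enough to cut down the documents each query has to score. Later Elasticsearch versions replaced `filtered` with a `bool` query that takes a `filter` clause, so this exact shape applies only to old releases such as the one in the question.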