Elasticsearch too many running threads
We have a big problem with our ES cluster: one of our nodes is always at 99% CPU. For some reason it has about three times as many threads running for the elasticsearch process as a normal node. I have attached two htop screenshots, one from the overloaded node and one from a normal node. Please advise!
Thank you!
Overloaded Node
Normal Node
UPDATE
Cluster architecture:
11 nodes: 2 dedicated masters, 9 data nodes.
Nodes Hardware Properties
Masters:
Slaves:
Documents in cluster:
~200 million
Index conf:
Each index is split into 10 shards (5 primary, 5 replica)
Queries:
Search RT: ~250/s, Index RT: ~6K/s
OS
Ubuntu 12.04.4 LTS
JAVA
java version "1.7.0_60" Java(TM) SE Runtime Environment (build 1.7.0_60-b19) Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode)
Figured it out.
[2014-07-07 13:38:42,521][DEBUG][index.search.slowlog.query] [n013.my_cluster] [my_index][3] took[2s], took_millis[2066], types[my_type], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"size":20,"from":0,"sort":{"_score":"desc"},"query":{"filtered":{"query":{"query_string":{"query":"my eight words space separated query","fields":["description","tags"],"default_operator":"OR"}},"filter":{"and":[{"range":{"ats":{"lte":1404730800}}},{"terms":{"aid":[1,2,4]}}]},"_cache":false}}}], extra_source[]
The problem was in the "filter": {"and": ...} clause; it looks like this kind of query is heavier for ES than bool-type queries. So whenever you want to apply several filters, please use bool filters (must, must_not and should).
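As a rough illustration of the change described above, here is a minimal Python sketch that rewrites the "and" filter from the slow-log entry into an equivalent bool filter with must clauses. The helper name and_filter_to_bool is made up for this example, and only the plain list form of "and" shown in the log is handled; the field names and values are taken from the logged query.

```python
import json


def and_filter_to_bool(filter_clause):
    """Hypothetical helper: turn {"and": [f1, f2]} into {"bool": {"must": [f1, f2]}}.

    Only the list form of the "and" filter (as seen in the slow log) is handled;
    any other filter is returned unchanged.
    """
    clauses = filter_clause.get("and")
    if isinstance(clauses, list):
        return {"bool": {"must": list(clauses)}}
    return filter_clause


# Filter part of the slow query, copied from the log entry above.
original_filter = {
    "and": [
        {"range": {"ats": {"lte": 1404730800}}},
        {"terms": {"aid": [1, 2, 4]}},
    ]
}

bool_filter = and_filter_to_bool(original_filter)
print(json.dumps(bool_filter, indent=2))
```

The resulting dict would take the place of the "filter" value inside the "filtered" query; the rest of the request stays as it was.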
Ref: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html
Cheers!
Based on the sparse info at hand, I have a couple of guesses as to what the problem could be:
Shards are not well balanced and you are getting hot spots. Ensure that your most heavily used indexes are sharded in such a way that each machine can do its share of the work. Also, look into the index-level setting "index.routing.allocation.total_shards_per_node" to try to force an even balance.
Perhaps on the search side you are specifying that searches should always go to the "primary" shard. The primary designation isn't something that gets balanced, so basically the first node up holds the primary shards and the nodes that come up afterwards hold only secondaries.
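For the "index.routing.allocation.total_shards_per_node" suggestion, a small sketch of how one might pick a value for the cluster described in the question (5 primaries, 1 replica each, 9 data nodes). The function name is hypothetical, and the rounding-up rule is just one reasonable choice, not an official recommendation.

```python
import math


def shards_per_node_limit(primaries, replicas_per_primary, data_nodes):
    """Hypothetical helper: smallest per-node cap that still fits every shard."""
    total_shards = primaries * (1 + replicas_per_primary)
    return math.ceil(total_shards / data_nodes)


# Numbers from the question: 5 primaries + 5 replicas spread over 9 data nodes.
limit = shards_per_node_limit(primaries=5, replicas_per_primary=1, data_nodes=9)

# Body one could send to the index settings endpoint of the busy index.
settings_body = {"index.routing.allocation.total_shards_per_node": limit}
print(settings_body)
```

Keep in mind that a cap this tight can leave shards unassigned when a node drops out, so some headroom may be preferable.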