Recently we switched our cluster to EC2 and everything is working great... except percolation :(
We use Elasticsearch 2.2.0. To reindex (and percolate) our data we use a separate EC2 c3.8xlarge instance (32 cores, 60GB, 2 x 160 GB SSD) and tell our index to include only this node in allocation. Because we'll distribute it amongst the rest of the nodes later, we use 10 shards, no replicas (just for indexing and percolation). There are about 22 million documents in the index and 15.000 percolators. The index is a tad smaller than 11GB (and so easily fits into memory). About 16 php processes talk to the REST API doing multi percolate requests with 200 requests in each (we made it smaller because of the performance, it was 1000 per request before).
One percolation request (a real one, tapped off of the php processes running) is taking around 2m20s under load (of the 16 php processes). That would've been ok if one of the resources on the EC2 was maxed out but that's the strange thing (see stats output here but also seen on htop, iotop and iostat): load, cpu, memory, heap, io; everything is well (very well) within limits. There doesn't seem to be a shortage of resources but still, percolation performance is bad.
When we back off the php processes and try the percolate request again, it comes out at around 15s. Just to be clear: I don't have a problem with a 2min+ multi percolate request. As long as I know that one of the resources is fully utilized (and I can act upon it by giving it more of what it wants).
So, ok, it's not the usual suspects, let's try different stuff:
processors
configuration in elasticsearch.yml
and restarted the node to fake our way to a higher usage of resources: no change. percolate
and get
pool size and queue size: no change. DiscovereUsageTrackingQueryCachingPolicy
was coming up a lot so we did as suggested in this issue : no change. This last point might be interesting! It seems that it's stuck somewhere for a loooong time and then goes through the percolate phase quite smoothly.
Can anyone make anything out of this? Anything we have missed? We can provide more data if needed.
Have a look at the thread on the Elastic Discuss forum to see the solution.
TLDR; Use multiple nodes on one big server to get better resource utilization.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.