I am following the AWS documentation for "Choosing the number of shards" for an Elasticsearch Index.
My read TPS for the ES index will be very high (around 1300 TPS, possibly increasing to 6500 TPS), but the amount of data will be very small (less than 1 GB).
Questions:
In Elasticsearch, each query is executed in a single thread per shard. Multiple shards can however be processed in parallel, as can multiple queries and aggregations against the same shard.
If the above understanding is correct, all requests will be single-threaded on a single data node if I only have one shard, so horizontal scaling cannot be implemented.

Since the data size is small and you need very high throughput, I would opt for 1 primary shard and as many replicas as the number of nodes minus 1 (the remaining node holds the primary). Replicas can serve search requests, so reads are spread across all nodes. The right number of nodes depends on your workload, so you'll have to test, but you could start with 3 nodes (a common resilient and performant first setup), which means 1 primary and 2 replicas in total. Check with that setup and try stress testing it.
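To make the "1 primary, nodes minus 1 replicas" rule concrete, here is a minimal sketch that builds the index settings body and gives a rough per-node read load. The helper names `index_settings` and `per_node_read_tps` are made up for illustration, and the per-node figure assumes searches are spread evenly across shard copies, which real clusters only approximate.

```python
import json


def index_settings(node_count: int) -> dict:
    """Build an index-creation body: 1 primary shard, one replica per remaining node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,
                "number_of_replicas": node_count - 1,
            }
        }
    }


def per_node_read_tps(total_tps: float, node_count: int) -> float:
    """Rough read load per node, ASSUMING an even spread across shard copies."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return total_tps / node_count


if __name__ == "__main__":
    # 3 nodes -> 1 primary + 2 replicas, as suggested in the answer.
    print(json.dumps(index_settings(3), indent=2))
    # Current and projected read rates from the question.
    for tps in (1300, 6500):
        print(tps, "TPS over 3 nodes:", round(per_node_read_tps(tps, 3), 1), "per node")
```

The resulting body can be sent when creating the index (`PUT /<index-name>`). Since `number_of_replicas` is a dynamic setting, it can also be raised later if you add nodes after stress testing.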
For the stress test you can use Rally, the benchmarking framework that Elastic uses when testing new Elasticsearch releases.
It's an interesting scenario, and most of the information provided is quite good; I just wanted to add the points below: