
Solr performance for large index with 4 servers

We have 4 servers: 2 with 48 GB RAM, 24 cores at 2.4 GHz, and 2 with 64 GB RAM, 24 cores at 2.4 GHz. We are using 4 shards (1 shard on each server). Each shard's index is about 500 GB.

We are using the edismax parser and the surround query parser to handle phrase, proximity and wildcard searches.

Even a simple wildcard/proximity search is taking 10-20 seconds.

We have the same setup on a single server (24 cores, 64 GB RAM, 2.4 GHz) with 8 shards (each shard's index is 250 GB).

The single-server setup performs almost twice as well as the 4-server setup.

We set up the 4-server SolrCloud cluster to improve performance, but performance got worse instead. Is there anything we might be missing here?

This question looks like a sister to "CPU usage when searching using solr", and the problem is the same: you are CPU-bound because your queries are very heavy. A query is matched against each shard in a single-threaded manner, so your 4-machine setup has 4 threads working on 500 GB of index each, while your single-machine setup has 8 threads working on 250 GB of index each. As you have more than enough CPU cores, the setup with the smaller shards will finish first.
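
The reasoning above can be put into a rough back-of-the-envelope model. The sketch below assumes that the time to search one shard grows linearly with its size and that each shard is searched by exactly one thread; both are simplifications. The function `relative_latency` is a hypothetical helper for illustration, not part of Solr.

```python
import math


def relative_latency(shard_size_gb, shards_per_machine, cores_per_machine):
    """Estimate query latency in arbitrary units.

    Assumption: one thread per shard, cost proportional to shard size.
    If a machine hosts more shards than it has cores, the shards have
    to be processed in several "waves".
    """
    waves = math.ceil(shards_per_machine / cores_per_machine)
    return shard_size_gb * waves


# 4 servers, 1 shard of 500 GB each, 24 cores per server
four_servers = relative_latency(500, 1, 24)

# 1 server, 8 shards of 250 GB each, 24 cores
single_server = relative_latency(250, 8, 24)

print(four_servers / single_server)  # 2.0
```

Under these assumptions the 4-server setup is about twice as slow, which matches the observed difference.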

If you split the shards further, e.g. to 50 GB each, you will have 40 shards. If you distribute them across the 4 machines with 10 shards per machine, each query keeps 10 of the 24 cores on a machine busy, so you can support 2 (in reality more like 3) concurrent requests at full CPU speed. Ideally, the five-times-smaller shards should give you 5 times the speed of your single-machine setup.
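
The arithmetic behind this suggestion can be checked with a few lines of Python. This is only a sketch under the same assumptions as the answer (single-threaded search per shard, latency proportional to shard size); `reshard_plan` is a hypothetical helper name, not a Solr API.

```python
def reshard_plan(total_index_gb, shard_size_gb, machines, cores_per_machine):
    """Return (shard count, shards per machine, concurrent requests per machine).

    Assumption: every query occupies one core per shard on each machine.
    """
    shards = total_index_gb // shard_size_gb
    shards_per_machine = shards // machines
    concurrent_requests = cores_per_machine / shards_per_machine
    return shards, shards_per_machine, concurrent_requests


# Total index: 4 shards of 500 GB, re-split into 50 GB shards on 4 machines
shards, per_machine, concurrent = reshard_plan(4 * 500, 50, 4, 24)
print(shards, per_machine, concurrent)  # 40 10 2.4

# Ideal speedup over the single-machine setup with 250 GB shards
ideal_speedup = 250 / 50
print(ideal_speedup)  # 5.0
```

The 2.4 concurrent requests per machine is where the "2, in reality more like 3" estimate comes from.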
