
Elasticsearch 'size:' vs MongoDB batch_size

For my thesis I'm currently investigating the speed (down to milliseconds) of Elasticsearch and MongoDB.

I've noticed that, compared to MongoDB, Elasticsearch is very consistent when it comes to the speed at which it returns data and the total items found. Whereas MongoDB takes longer to return data the more results are found, Elasticsearch's response time is almost always the same, regardless of the total number of results found.

My hypothesis is that in Elasticsearch, when using the size parameter, the number of documents that are actually looked up and retrieved after the index search finishes is exactly the number set in size. In MongoDB this is not the case: all documents that matched in the index are retrieved, and only the top X are eventually returned to the client, based on the cursor's batch_size and, ultimately, the limit() that is set.
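For reference, this is roughly the query pair I am comparing. It is only a sketch: the index/collection names, connection URLs, and the match-all query are placeholders, and it assumes the official Python clients (elasticsearch-py 8.x and pymongo).

```python
from elasticsearch import Elasticsearch
from pymongo import MongoClient

# Elasticsearch: ask for at most 10 hits via the "size" parameter.
es = Elasticsearch("http://localhost:9200")        # placeholder URL
es_resp = es.search(index="articles",              # placeholder index name
                    query={"match_all": {}},
                    size=10)
es_docs = [hit["_source"] for hit in es_resp["hits"]["hits"]]

# MongoDB: find() returns a cursor; batch_size() controls how many documents
# come back per round trip, and limit() caps the total returned to the client.
mongo = MongoClient("mongodb://localhost:27017")   # placeholder URL
cursor = (mongo["testdb"]["articles"]              # placeholder db/collection
          .find({})
          .batch_size(10)
          .limit(10))
mongo_docs = list(cursor)
```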

I have no way, other than to spend hours looking through the source code, to figure out if this hypothesis is correct, or if something else is going on that I must have missed.

Thanks for taking the time to read this, any responses are appreciated and will help me further my research.

To make it a bit clearer how Elasticsearch actually retrieves results: it uses query-then-fetch.

So if you search for N results, the first phase queries all the shards involved, and each shard returns a list of its top N results containing only the score and the ID, no other information. In the second phase you fetch the global top N results by their IDs. So you retrieve more scores and IDs than you need, but you only fetch the actual documents for the final top N.
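A rough way to picture the two phases is the toy simulation below. The shard contents, IDs, and scores are made up, and this is not Elasticsearch's actual code, just the merge arithmetic described above.

```python
import heapq

# Hypothetical data: 3 shards, each holding (doc_id, score) pairs.
shards = {
    "shard0": [("a1", 0.91), ("a2", 0.40), ("a3", 0.13)],
    "shard1": [("b1", 0.87), ("b2", 0.66)],
    "shard2": [("c1", 0.95), ("c2", 0.20), ("c3", 0.05)],
}
N = 2  # the requested "size"

# Phase 1 (query): every shard returns only its own top-N scores and IDs.
per_shard_top = {
    name: heapq.nlargest(N, docs, key=lambda d: d[1])
    for name, docs in shards.items()
}

# The coordinating node now holds up to (number of shards * N) candidates...
candidates = [doc for docs in per_shard_top.values() for doc in docs]

# ...but keeps only the global top-N of them.
global_top = heapq.nlargest(N, candidates, key=lambda d: d[1])

# Phase 2 (fetch): only these N document IDs are fetched in full.
ids_to_fetch = [doc_id for doc_id, _ in global_top]
print(ids_to_fetch)  # ['c1', 'a1']
```

With 3 shards and size=2, the coordinating node sees up to 6 (ID, score) pairs in the query phase but fetches only 2 full documents in the fetch phase, which is why the work grows with size rather than with the total number of matches.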
