Using the Elasticsearch scan-and-scroll feature, is it possible to control both the size of the batches returned and the limit on the total number of matches?
According to the Elasticsearch scan-and-scroll documentation:
"Although we specified a size of 1,000, we get back many more documents. When scanning, the size is applied to each shard, so you will get back a maximum of size * number_of_primary_shards documents in each batch."
This seems to indicate that the size parameter is used differently in a scan-and-scroll than it would be in a query-then-fetch-type search (where it limits the number of matches), and that there is not a "separate knob" that can be specified.
Update
A use case for this is:
0 (or some really high number) to allow the user to eventually page through everything, if necessary.
Scan-and-scroll seems like a good choice, but perhaps there's a better way to do this?
size is used differently in scan and scroll. It does limit the number of documents returned with each scroll, but it is applied per shard, so you get up to size * num_of_primary_shards back.
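To make the arithmetic concrete, here is a minimal Python sketch. The function names and the shard count of 5 are assumptions chosen for illustration; nothing here calls Elasticsearch.

```python
def max_hits_per_scroll(size, primary_shards):
    """Upper bound on hits in one scan-and-scroll batch: size is applied per shard."""
    return size * primary_shards


def size_for_target_batch(target_batch, primary_shards):
    """Per-shard size that keeps a batch at or below target_batch hits."""
    # Floor division so the batch never exceeds the target; never go below 1.
    return max(1, target_batch // primary_shards)


# Assumed example: an index with 5 primary shards.
print(max_hits_per_scroll(1000, 5))
print(size_for_target_batch(1000, 5))
```

So if you want roughly 1,000 documents per batch on a 5-shard index, you would ask for a size of about 200 rather than 1,000.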
In general you are correct, but you could limit the hits returned using a limit filter (or limit query in 2.0). That seems a little odd, though, so I'd make sure scan and scroll is the best approach if limiting it in this way is the desired behavior.
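A different way to get an overall cap, not part of the suggestion above, is to stop consuming scroll batches on the client once enough hits have been collected. The sketch below shows only that idea. fake_scroll is a hypothetical stand-in for successive scroll responses (it makes no Elasticsearch call), and take_hits is an illustrative name.

```python
from itertools import chain, islice


def take_hits(batches, limit):
    """Return an iterator over at most `limit` hits from an iterable of batches."""
    # chain.from_iterable is lazy, so batches after the one that reaches
    # the limit are never requested.
    return islice(chain.from_iterable(batches), limit)


def fake_scroll(total_hits, size, primary_shards):
    """Hypothetical stand-in for a scroll: yields batches of size * primary_shards."""
    per_batch = size * primary_shards
    for start in range(0, total_hits, per_batch):
        yield list(range(start, min(start + per_batch, total_hits)))


# Assumed numbers: 100 matching documents, size 5, 3 primary shards, cap of 20 hits.
hits = list(take_hits(fake_scroll(total_hits=100, size=5, primary_shards=3), limit=20))
print(len(hits))
```

With a real client, the generator would instead issue the next scroll request each time it is advanced, so stopping early also avoids the remaining round trips.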