I'm working with a huge Elasticsearch index (5 million documents) and I need to fetch data using a sliced scroll in Python. My question: is there a way to limit (set the `size` param of) a sliced scroll? I tried setting it via `[search obj].params(size=500000)` and via slicing (`[:500000]`), but neither seems to work: the sliced scroll still gives me all documents.
In my script, I'm using a sliced scroll with Python `multiprocessing`, as in https://github.com/elastic/elasticsearch-dsl-py/issues/817
Is there some way to get, for example, 500,000 documents using a sliced scroll?
Thanks in advance.
Answer from GitHub:
"There is no limit on scroll, it always returns all documents. To only get a subset, simply stop consuming the iterator after you get the number you wanted to retrieve, by using a break statement or similar."
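A minimal sketch of that break pattern. Since `scan()` yields hits lazily, stopping the loop stops the scroll; here a plain generator (`scan_stub`, a hypothetical stand-in defined for this example) takes the place of `Search.scan()` from elasticsearch-dsl so the pattern is shown on its own:

```python
def scan_stub():
    # Stand-in for elasticsearch_dsl's Search.scan(), which lazily
    # yields one hit at a time from the scroll context.
    for i in range(1_000_000):
        yield {"_id": i}

LIMIT = 500_000

docs = []
for hit in scan_stub():
    docs.append(hit)
    if len(docs) >= LIMIT:
        break  # stop consuming the iterator once we have enough documents

print(len(docs))  # only LIMIT documents were pulled, not the full result set
```

With the real sliced scroll from the linked issue, each worker process would apply the same `break` to its own slice (e.g. stop each of N slices at `LIMIT // N` hits) so the combined total stays at the desired limit.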