
Elasticsearch sliced scroll limit (Python)

I'm working with a huge Elasticsearch database (5 million documents) and I need to fetch data using a sliced scroll in Python. My question: is there a way to limit (set the size param of) a sliced scroll? I tried setting the size param via [search obj].params(size=500000) or [:500000], but neither seems to work - the sliced scroll still gives me all documents.

In my script I'm using sliced scroll with Python multiprocessing, following the pattern shown here: https://github.com/elastic/elasticsearch-dsl-py/issues/817
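
For context, here is a minimal sketch of that pattern; the index name "my-index", the localhost address, and the slice count of 4 are assumptions to adapt to your own setup:

from multiprocessing import Pool

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

SLICES = 4  # assumed number of parallel slices

def process_slice(slice_id):
    # Each worker creates its own client; connections must not be
    # shared across processes.
    client = Elasticsearch("http://localhost:9200")
    s = Search(using=client, index="my-index")
    # slice is a top-level key of the scroll request body, so it is
    # set via extra(); scroll keep-alive is passed through params().
    s = s.extra(slice={"id": slice_id, "max": SLICES}).params(scroll="5m")
    count = 0
    for hit in s.scan():  # scan() drives the scroll API for this slice
        count += 1        # ...process each hit here...
    return count

if __name__ == "__main__":
    with Pool(SLICES) as pool:
        print(sum(pool.map(process_slice, range(SLICES))))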

Is there some way to get, for example, 500,000 documents using a sliced scroll?

Thanks in advance.

Answer from GitHub:

"There is no limit on scroll, it always returns all documents. To only get a subset simply stop consuming the iterator after you get the number you wanted to retrieve by using a break statement or similar."

https://github.com/elastic/elasticsearch-dsl-py/issues/817
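
Following that advice, a minimal sketch of capping the result count; again the index name "my-index" and the localhost address are assumptions. itertools.islice stops the iterator after a fixed number of hits, which is equivalent to counting and breaking manually:

from itertools import islice

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

LIMIT = 500000  # total number of documents to fetch

client = Elasticsearch("http://localhost:9200")
s = Search(using=client, index="my-index").params(scroll="5m")

# Stop consuming the scroll once LIMIT hits have been yielded; the scan
# helper clears the scroll context when the generator is closed.
for hit in islice(s.scan(), LIMIT):
    ...  # process each hit here

With sliced scroll, apply the cap per slice instead (e.g. LIMIT // SLICES in each worker) so the slices together return roughly the requested total.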
