
In Elasticsearch 6.2.4, how large can scroll_size be?

I'm using update_by_query to update a field across a whole index. It may hold 30,000,000 documents, or even more in the future. I read the documentation for the scroll_size parameter and know that it defaults to 1,000, but I couldn't find anything about its upper limit.

So my questions are:

* How large can scroll_size be?
* Does a larger value use more memory?
* If it does use more memory, are there any alternatives?

My request:

POST /myIndex/myType/_update_by_query?conflicts=proceed&scroll_size=20000
{
    "script": {
        "source": "ctx._source['toUserNickname'] = 'test'",
        "lang": "painless"
    },
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "toUserId": "111"
                    }
                }
            ]
        }
    }
}

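There is no fixed maximum built into update_by_query itself, but a larger scroll_size does use more memory, because each batch of scroll results is held in memory and turned into one bulk request. As far as I know, the batch size is also bounded by the index-level index.max_result_window setting (10,000 by default), so scroll_size=20000 would be rejected unless that setting is raised. You can tune scroll_size, requests_per_second and slices so that the update doesn't take up too much memory or time.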
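
To get a feel for the trade-off, here is a rough sketch of the arithmetic, not anything Elasticsearch provides. The helper name plan_update and the throttle value of 5000 are made up for illustration. It assumes the documented throttling behaviour, where each batch is padded to take about scroll_size / requests_per_second seconds.

```python
import math


def plan_update(total_docs, scroll_size=1000, requests_per_second=None):
    """Estimate the batch count and the throttle-imposed minimum runtime."""
    # Each scroll batch becomes one bulk request, so this is also the
    # number of bulk requests the update will issue.
    batches = math.ceil(total_docs / scroll_size)
    # With throttling, the job cannot finish faster than
    # total_docs / requests_per_second seconds, whatever the batch size.
    if requests_per_second is None:
        min_seconds = None
    else:
        min_seconds = total_docs / requests_per_second
    return batches, min_seconds


# 30,000,000 documents, as in the question; 5000 docs/s is an arbitrary throttle.
for size in (1000, 5000, 10000):
    print(size, plan_update(30_000_000, scroll_size=size, requests_per_second=5000))
```

A larger scroll_size only reduces the number of round trips. It does not reduce the total work, and each batch costs more memory while it is in flight.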

Reading up on pagination will be helpful: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html

Another similar question: Max scrollable time for elasticsearch

Alternative: parallel scanning - https://hackernoon.com/parallel-scan-scroll-an-elasticsearch-index-db02583d10d1
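
For update_by_query specifically, the parallel approach is exposed through the slices request parameter, so you may not need to manage the scrolls yourself. Below is a minimal sketch that only builds the request, reusing the placeholder index, type and field names from the question. The function name, the localhost URL and the values of scroll_size and slices are assumptions for illustration, not recommendations.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_by_query(base_url, index, doc_type, user_id, nickname,
                          scroll_size=1000, slices=5):
    """Build (but do not send) a sliced _update_by_query request."""
    params = urlencode({
        "conflicts": "proceed",
        "scroll_size": scroll_size,      # documents per scroll batch, per slice
        "slices": slices,                # number of parallel sub-tasks
        "wait_for_completion": "false",  # return a task id instead of blocking
    })
    url = f"{base_url}/{index}/{doc_type}/_update_by_query?{params}"
    body = {
        "script": {
            # Passing the value through params keeps the script text constant.
            "source": "ctx._source['toUserNickname'] = params.nickname",
            "lang": "painless",
            "params": {"nickname": nickname},
        },
        "query": {"bool": {"must": [{"match": {"toUserId": user_id}}]}},
    }
    return Request(url, data=json.dumps(body).encode("utf-8"),
                   headers={"Content-Type": "application/json"}, method="POST")


# Hypothetical host; index and type names are the question's placeholders.
req = build_update_by_query("http://localhost:9200", "myIndex", "myType", "111", "test")
print(req.get_method(), req.full_url)
```

Sending it is then a matter of passing req to urllib.request.urlopen, or issuing the same URL and body from any HTTP client. With wait_for_completion=false the response should contain a task id that can be polled through the tasks API.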
