
What is the best approach to using the remote reindex API of an Elasticsearch cluster when migrating millions of documents?

I have approximately 100 million documents in an index, and I want to migrate them to a new cluster using the reindex API. I want to do it in a throttled manner.

I tried setting requests_per_second to 100000, but the whole process still takes hours to complete.

Q1. Can I raise requests_per_second to, say, 1000000 to reduce the processing time?
Q2. Is there a better approach to throttled reindexing?

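Reindex supports sliced scroll to parallelize the reindexing process. Each slice runs as its own sub-task, which can improve throughput and breaks the request down into smaller parts. Note that, according to the reindex documentation, slicing is not supported when reindexing from a remote cluster, so the example below applies to a reindex within one cluster.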

POST _reindex?slices=5&refresh
{
  "source": {
    "index": "my-index-000001"
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}

https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-automatic-slice
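
For the throttled, remote case described in the question, the request might look roughly like the sketch below. It only builds the request path and body and sends nothing. The host `http://old-cluster:9200` and the helper names are hypothetical, and the remote host would also need to be allowed by `reindex.remote.whitelist` on the destination cluster. The first helper is plain arithmetic: since the throttle works by padding the wait between batches, the total document count divided by `requests_per_second` gives a rough lower bound on run time. For 100 million documents at 100000 that floor is about 17 minutes, so a run that takes hours is probably limited by something other than the throttle, and raising the value further is unlikely to help much.

```python
import json
from urllib.parse import urlencode


def throttle_floor_seconds(total_docs: int, requests_per_second: float) -> float:
    """Rough lower bound on run time imposed by the throttle alone."""
    return total_docs / requests_per_second


def build_remote_reindex(remote_host, source_index, dest_index,
                         requests_per_second, batch_size=1000):
    """Return (path, body) for a throttled reindex-from-remote request.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    query = urlencode({
        "requests_per_second": requests_per_second,
        # Run as a background task so the HTTP call returns a task id.
        "wait_for_completion": "false",
    })
    body = {
        "source": {
            "remote": {"host": remote_host},  # assumed address of the old cluster
            "index": source_index,
            "size": batch_size,               # documents fetched per scroll batch
        },
        "dest": {"index": dest_index},
    }
    return f"_reindex?{query}", body


if __name__ == "__main__":
    print(throttle_floor_seconds(100_000_000, 100_000))  # 1000.0 seconds
    path, body = build_remote_reindex(
        "http://old-cluster:9200", "my-index-000001",
        "my-new-index-000001", requests_per_second=100000,
    )
    print("POST", path)
    print(json.dumps(body, indent=2))
```

Because the request runs as a task, the throttle can be changed while it is running with `POST _reindex/<task_id>/_rethrottle?requests_per_second=...`, which avoids restarting the migration just to try a different rate.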

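You can also follow the advice on tuning for indexing speed, for example:

  • Disable refresh on the destination index for the duration of the reindex (set index.refresh_interval to -1)
  • Reduce replicas on the destination index to 0 (index.number_of_replicas), and restore both settings once the reindex has finished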

Link:

https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html
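
The two settings changes above can be sketched as a pair of request bodies for `PUT /my-new-index-000001/_settings`, one applied before the reindex and one after. The helper names are hypothetical, and the restore values (`1s` refresh, 1 replica) are assumptions that should be replaced with whatever the destination index is meant to run with.

```python
import json


def bulk_load_settings() -> dict:
    """Settings to apply to the destination index before reindexing."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def restore_settings(refresh_interval: str = "1s", number_of_replicas: int = 1) -> dict:
    """Settings to apply once the reindex has finished (assumed target values)."""
    return {
        "index": {
            "refresh_interval": refresh_interval,
            "number_of_replicas": number_of_replicas,
        }
    }


if __name__ == "__main__":
    print("PUT /my-new-index-000001/_settings")
    print(json.dumps(bulk_load_settings(), indent=2))
    print("PUT /my-new-index-000001/_settings")
    print(json.dumps(restore_settings(), indent=2))
```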
