
Migration of data from one Elasticsearch index to another in a different AWS region using manual snapshots

I have created two Elasticsearch domains, one in us-east-1 and another in us-west-2. I have registered a manual snapshot repository on the us-east-1 domain and taken a snapshot; the data is in an S3 bucket in us-east-1.

How should I go about doing the restoration?

Main questions:

  1. Do I have to set up cross-region replication of the S3 bucket to us-west-2, so that every time a snapshot is taken in us-east-1 it is automatically replicated to the us-west-2 bucket?

  2. If so, do I have to be in us-west-2 to register the manual snapshot repository on the domain with that S3 bucket?

  3. Will the restore API call look like this? curl -XPOST 'elasticsearch-domain-endpoint-us-west-2/_snapshot/repository-name/snapshot-name/_restore'

  1. You don't need to create S3 buckets in several regions. One is sufficient. So your S3 repository will be in us-west-2.

  2. You need to create the snapshot repository in both of your clusters so that you can access it from both sides. From one cluster you will create snapshots, and from the second cluster you will be able to restore them.

  3. Yes, that's correct.

1.- No, as Val said, you don't need to create S3 buckets in several regions. "All buckets work globally" (see: AWS S3 Bucket with Multiple Regions).

2.- Yes, you do. You need to create the snapshot repository in both of your clusters: one registration on the us-east-1 cluster to write your snapshot to the S3 bucket, and another on the us-west-2 cluster so that the destination cluster can read the snapshot.

3.- Yes, it is. Additionally, you need to sign your calls to AWS ES to be able to create the repository and to take the snapshot. The best option for me was to use the Python scripts below. Signing was not necessary for the restore.

Follow these instructions: https://medium.com/docsapp-product-and-technology/aws-elasticsearch-manual-snapshot-and-restore-on-aws-s3-7e9783cdaecb and https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html

Create a repository

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = 'https://localhost:9999/' # your Elasticsearch endpoint; include https:// and the trailing /. With a VPC domain you can reach it through a tunnel
region = 'us-east-1' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

path = '_snapshot/yourreponame' # the Elasticsearch API endpoint
url = host + path

payload = {
  "type": "s3",
  "settings": {
    "bucket": "yourreponame_bucket",
    "region": "us-east-1",
    "role_arn": "arn:aws:iam::1111111111111:role/AmazonESSnapshotRole"  # don't forget to create the AmazonESSnapshotRole
  }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers, verify=False)

print(r.status_code)
print(r.text)
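
Since the repository has to be registered on both domains, here is a minimal sketch of how the two registrations might be prepared, assuming the bucket stays in us-east-1 and the second domain is in us-west-2. The helper name `build_repo_registration` and both endpoint hostnames are hypothetical. The point it tries to show is that both registrations carry the same bucket settings, and that the `region` in the payload describes the bucket, not the domain being called.

```python
def build_repo_registration(host, repo_name, bucket, bucket_region, role_arn):
    """Return (url, payload) for PUT _snapshot/<repo_name>. Sends nothing."""
    url = host.rstrip('/') + '/_snapshot/' + repo_name
    payload = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "region": bucket_region,  # region of the bucket, not of the domain
            "role_arn": role_arn,
        },
    }
    return url, payload


# Hypothetical endpoints; both registrations point at the same bucket.
role = "arn:aws:iam::1111111111111:role/AmazonESSnapshotRole"
source = build_repo_registration("https://source-domain.example/", "yourreponame",
                                 "yourreponame_bucket", "us-east-1", role)
target = build_repo_registration("https://target-domain.example/", "yourreponame",
                                 "yourreponame_bucket", "us-east-1", role)
print(source[0])
print(target[0])
```

Each pair would then be sent with requests.put(url, auth=awsauth, json=payload) as in the script above, with the AWS4Auth object built for the region of the domain that receives the call (an assumption worth checking against the AWS documentation for your setup).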

Create a snapshot

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = 'https://localhost:9999/' # include https:// and trailing /
region = 'us-east-1' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

path = '_snapshot/yourreponame/yoursnapshot_name' # the Elasticsearch API endpoint
url = host + path

payload = {
  "indices": "*",
  "include_global_state": False,  # real booleans, serialized as JSON false
  "ignore_unavailable": False
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers, verify=False)

print(r.status_code)
print(r.text)
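
By default the snapshot request returns before the snapshot has finished, so it can help to check its state before attempting a restore. A minimal sketch, assuming you fetch GET _snapshot/yourreponame/yoursnapshot_name with the same signed session and pass the decoded JSON to the hypothetical helper `snapshot_state`; the sample body below is trimmed to the fields the helper reads.

```python
def snapshot_state(response_json, snapshot_name):
    """Return the state of snapshot_name from a GET _snapshot/<repo>/<snapshot> body, or None."""
    for snap in response_json.get("snapshots", []):
        if snap.get("snapshot") == snapshot_name:
            return snap.get("state")
    return None


# Sample response body, trimmed for illustration.
sample = {"snapshots": [{"snapshot": "yoursnapshot_name", "state": "IN_PROGRESS"}]}
print(snapshot_state(sample, "yoursnapshot_name"))
```

A state of SUCCESS means the snapshot is complete; IN_PROGRESS means it is still being written.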

Restore

Must be called without signing

curl -XPOST -k "https://localhost:9999/_snapshot/yourreponame/yoursnapshot_name/_restore" \
-H "Content-type: application/json" \
-d $'{
  "indices": "*",
  "ignore_unavailable": false,
  "include_global_state": false,
  "include_aliases": false
}'
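
Restoring "*" can fail when the destination domain already has an open index with the same name as one in the snapshot. A minimal sketch of a narrower restore body, assuming you want to restore named indices under a prefix using the restore API's `rename_pattern` and `rename_replacement` options; the helper `build_restore_body` and the index names are hypothetical.

```python
import json


def build_restore_body(indices, prefix=None):
    """Build a _restore request body for the given index names."""
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": False,
        "include_global_state": False,
        "include_aliases": False,
    }
    if prefix:
        # Capture the whole index name and restore it as <prefix><name>.
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = prefix + "$1"
    return body


restore_body = build_restore_body(["logs-2020", "users"], prefix="restored_")
print(json.dumps(restore_body, indent=2))
```

The printed JSON can be passed as the -d argument of the curl call above.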

It is highly recommended that both clusters run the same version.
