Migration of data from one Elasticsearch index to another in a different AWS region using manual snapshots
I have created two Elasticsearch domains, one in us-east-1 and another in us-west-2. I have registered a manual snapshot repository on the us-east-1 domain and taken a snapshot; the data is in an S3 bucket in us-east-1.
How should I go about doing the restoration?
Main questions:
1.- Do I have to set up cross-region replication of the S3 bucket to us-west-2, so that every time a snapshot is taken in us-east-1 it is automatically reflected in the us-west-2 bucket?
2.- If so, do I have to be in us-west-2 to register the manual snapshot repository on the domain and that S3 bucket?
3.- Will the restore API look like this?
curl -XPOST 'elasticsearch-domain-endpoint-us-west-2/_snapshot/repository-name/snapshot-name/_restore'
You don't need to create S3 buckets in several regions. One is sufficient. So your S3 repository will be in us-west-2.
You need to create the snapshot repository in both of your clusters so that you can access it from both sides. From one cluster you will create snapshots, and from the second cluster you'll be able to restore them.
Yes, that's correct.
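As a rough illustration of registering the same repository on both clusters, the sketch below builds one registration request per domain, all pointing at a single bucket. The domain endpoints, repository name, bucket name and role ARN are hypothetical placeholders, and keeping the bucket's own region in the settings for both domains is an assumption, not something confirmed in this thread. It only builds the requests; sending them (signed) is left out.

```python
import json

# Hypothetical domain endpoints, keyed by the region each domain lives in.
DOMAINS = {
    "us-east-1": "https://source-domain.us-east-1.es.amazonaws.com",
    "us-west-2": "https://dest-domain.us-west-2.es.amazonaws.com",
}


def build_repo_registration(domain_endpoint, repo_name, bucket, bucket_region, role_arn):
    """Return (url, payload) for a PUT _snapshot/<repo_name> call on one domain."""
    url = domain_endpoint.rstrip("/") + "/_snapshot/" + repo_name
    payload = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "region": bucket_region,  # assumption: the bucket's region, not the domain's
            "role_arn": role_arn,
        },
    }
    return url, payload


def build_all_registrations(repo_name, bucket, bucket_region, role_arn):
    """Build one registration request per domain, all pointing at the same bucket."""
    return {
        domain_region: build_repo_registration(endpoint, repo_name, bucket, bucket_region, role_arn)
        for domain_region, endpoint in DOMAINS.items()
    }


if __name__ == "__main__":
    registrations = build_all_registrations(
        "yourreponame",
        "yourreponame_bucket",
        "us-east-1",
        "arn:aws:iam::1111111111111:role/AmazonESSnapshotRole",
    )
    for domain_region, (url, payload) in registrations.items():
        print(domain_region, "PUT", url)
        print(json.dumps(payload, indent=2))
```

Each PUT would still have to be signed with the region of the domain it is sent to, even though the bucket stays where it is.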
1.- No. As Val said, you don't need to create S3 buckets in several regions. "All buckets work globally" (see "AWS S3 Bucket with Multiple Regions").
2.- Yes, you do. You need to create the snapshot repository in both of your clusters: one repository to create your snapshot in the S3 bucket in us-east-1, and another for your snapshot in us-west-2, so that you can read it from your destination cluster.
3.- Yes, it is. Additionally, you need to sign your calls to AWS ES to be able to create the repository and to take the snapshot. The best option for me was to use the Python script described below. Signing is not necessary for the restore.
Follow these instructions: https://medium.com/docsapp-product-and-technology/aws-elasticsearch-manual-snapshot-and-restore-on-aws-s3-7e9783cdaecb and https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html
Create a repository
import boto3
import requests
from requests_aws4auth import AWS4Auth

# Your Elasticsearch endpoint; include https:// and the trailing /.
# If the domain is in a VPC, you can create a tunnel and use localhost.
host = 'https://localhost:9999/'
region = 'us-east-1'  # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

path = '_snapshot/yourreponame'  # the Elasticsearch API endpoint
url = host + path

payload = {
    "type": "s3",
    "settings": {
        "bucket": "yourreponame_bucket",
        "region": "us-east-1",
        # Don't forget to create the AmazonESSnapshotRole
        "role_arn": "arn:aws:iam::1111111111111:role/AmazonESSnapshotRole"
    }
}

headers = {"Content-Type": "application/json"}

# verify=False skips TLS certificate checks, since the certificate
# will not match localhost when going through a tunnel.
r = requests.put(url, auth=awsauth, json=payload, headers=headers, verify=False)

print(r.status_code)
print(r.text)
Create a snapshot
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = 'https://localhost:9999/'  # include https:// and trailing /
region = 'us-east-1'  # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

path = '_snapshot/yourreponame/yoursnapshot_name'  # the Elasticsearch API endpoint
url = host + path

# Real booleans, serialized by requests as JSON true/false
payload = {
    "indices": "*",
    "include_global_state": False,
    "ignore_unavailable": False
}

headers = {"Content-Type": "application/json"}

# verify=False skips TLS certificate checks (see the tunnel note above)
r = requests.put(url, auth=awsauth, json=payload, headers=headers, verify=False)

print(r.status_code)
print(r.text)
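The snapshot runs in the background, so before restoring it can help to check whether it has finished. A minimal sketch, assuming you have already fetched the JSON body of a GET on _snapshot/yourreponame/yoursnapshot_name (with the same signed session as above); the sample body here is illustrative, not real output from a cluster:

```python
def snapshot_state(response_body, snapshot_name):
    """Return the state of the named snapshot from a GET _snapshot/<repo>/<snapshot> body.

    Returns None if the snapshot is not listed in the response.
    """
    for snap in response_body.get("snapshots", []):
        if snap.get("snapshot") == snapshot_name:
            return snap.get("state")
    return None


def is_restorable(response_body, snapshot_name):
    """Treat only a snapshot in the SUCCESS state as ready to restore."""
    return snapshot_state(response_body, snapshot_name) == "SUCCESS"


if __name__ == "__main__":
    # Illustrative response body; a real one carries more fields per snapshot.
    sample = {"snapshots": [{"snapshot": "yoursnapshot_name", "state": "IN_PROGRESS"}]}
    print(snapshot_state(sample, "yoursnapshot_name"))
    print(is_restorable(sample, "yoursnapshot_name"))
```

Polling until the state becomes SUCCESS (or stopping on FAILED or PARTIAL) is one way to avoid starting a restore from an incomplete snapshot.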
Restore
Must be called without signing.
curl -XPOST -k "https://localhost:9999/_snapshot/yourreponame/yoursnapshot_name/_restore" \
  -H "Content-Type: application/json" \
  -d $'{
  "indices": "*",
  "ignore_unavailable": false,
  "include_global_state": false,
  "include_aliases": false
}'
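If you prefer to build the restore call in Python, or need to restore next to indices that already exist on the destination, something like the following might work. It is only a sketch: the helper name and the "restored_" prefix are made up for illustration, while rename_pattern and rename_replacement are standard options of the Elasticsearch restore API. It builds the URL and body but does not send anything.

```python
import json


def build_restore_request(host, repo_name, snapshot_name, indices="*", rename_prefix=None):
    """Return (url, body) for POST _snapshot/<repo>/<snapshot>/_restore.

    If rename_prefix is given, every restored index gets that prefix, so it
    does not clash with an index of the same name on the destination cluster.
    """
    url = "{}/_snapshot/{}/{}/_restore".format(host.rstrip("/"), repo_name, snapshot_name)
    body = {
        "indices": indices,
        "ignore_unavailable": False,
        "include_global_state": False,
        "include_aliases": False,
    }
    if rename_prefix:
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = rename_prefix + "$1"
    return url, body


if __name__ == "__main__":
    url, body = build_restore_request(
        "https://localhost:9999/", "yourreponame", "yoursnapshot_name",
        rename_prefix="restored_",
    )
    print("POST", url)
    print(json.dumps(body, indent=2))
```

The printed body can be passed to curl with -d, or to requests.post(url, json=body).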
It is highly recommended that the clusters have the same version.
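To check this before restoring, you could compare the version.number values that each cluster returns from a GET on its root endpoint. The sketch below is a simplification of my understanding of the Elasticsearch compatibility rule (a snapshot restores into the same or a newer version, at most one major version ahead); the function names are hypothetical and it assumes plain numeric versions such as 7.10.2.

```python
def parse_version(number):
    """Turn a version string such as '7.10.2' into a tuple of ints."""
    return tuple(int(part) for part in number.split("."))


def can_restore(source_version, dest_version):
    """Simplified check: destination is not older and at most one major version ahead."""
    src = parse_version(source_version)
    dst = parse_version(dest_version)
    return dst >= src and dst[0] - src[0] <= 1


if __name__ == "__main__":
    for src, dst in [("7.10.2", "7.10.2"), ("6.8.0", "7.10.2"), ("7.10.2", "6.8.0")]:
        print(src, "->", dst, can_restore(src, dst))
```

Even when this check passes, matching versions exactly remains the safest option.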