aws s3 restore all files of a folder
I have files archived on AWS S3 Glacier Deep Archive. I want to initiate the restoration of all objects starting with a given prefix.
To do that, I first tried the AWS CLI with this command:
aws s3api list-objects-v2 \
    --bucket ${bucket} \
    --prefix "${prefix}" \
    --query "Contents[?StorageClass=='DEEP_ARCHIVE'].Key" \
    --output text \
  | sed 's/\t/\n/g' \
  | xargs -I %%% \
    aws s3api restore-object \
        --restore-request Days=${days},GlacierJobParameters={"Tier"=\""${mode}"\"} \
        --bucket ${bucket} \
        --key "%%%"
I don't know why, but some objects initiated a restoration while others (the majority) did not.
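In hindsight, one likely culprit is the quoting of `--restore-request`: the inline escaped quotes around `Tier` can produce a value the CLI rejects for some shells or keys. A minimal sketch with simpler quoting, assuming bash (note that, to my knowledge, `restore-object` expects the tier capitalized as `Standard`/`Bulk`, unlike the uppercase `STANDARD`/`BULK` used by S3 Batch Operations):

```shell
#!/usr/bin/env bash
# Sketch: build the --restore-request value as one plain string instead of
# escaping quotes inline; the whole value is then passed as a single word.
days=2
mode=Standard   # restore-object tiers: Expedited | Standard | Bulk
request="Days=${days},GlacierJobParameters={Tier=${mode}}"
echo "$request"
# then, per key:
# aws s3api restore-object --bucket "$bucket" --key "$key" --restore-request "$request"
```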
So I then tried Python with the following code:
import boto3

def restore_object(bucket, prefix, days, tier):
    s3 = boto3.resource('s3')
    client = boto3.client('s3')
    my_bucket = s3.Bucket(bucket)
    with open("restoration.log", "w") as logfile:
        for obj in my_bucket.objects.filter(Prefix=prefix):
            if obj.storage_class == "DEEP_ARCHIVE":
                try:
                    client.restore_object(
                        Bucket=bucket,
                        Key=obj.key,
                        RestoreRequest={
                            'Days': days,
                            'GlacierJobParameters': {'Tier': tier}
                        }
                    )
                except Exception as e:
                    logfile.write(f'For the object {obj.key}, {e}\n')
But it is very slow: four hours in, the script was still running and many objects had still not initiated restoration. There are about 70,000 objects in this folder.
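For scale, a back-of-the-envelope estimate shows why a serial per-object loop takes hours (the ~5 `restore_object` calls per second figure is an assumption, not a measurement):

```python
# Each restore_object call is one HTTPS round-trip, executed serially,
# so total time grows linearly with object count.
objects = 70_000
requests_per_second = 5   # assumed throughput
hours = objects / requests_per_second / 3600
print(f"about {hours:.1f} hours")  # about 3.9 hours
```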
As @John suggested, I finally used an S3 Batch Operations job. For people interested, here is the code:
#!/usr/bin/env python3.9
import argparse
from urllib.parse import quote_plus
import os
import boto3
def get_arguments():
    parser = argparse.ArgumentParser(
        description='''
        Restoration of objects from Glacier Deep Archive on AWS S3.
        This script creates a manifest file listing every object to restore,
        uploads the manifest to S3, then creates and runs an S3 batch job.
        ''',
        formatter_class=argparse.RawTextHelpFormatter,
        usage='use "%(prog)s --help" for more information')
    parser.add_argument(
        '--bucket',
        type=str,
        help='bucket name (default: %(default)s)',
        default='my-bucket',
        required=False)
    parser.add_argument(
        '--prefix',
        type=str,
        help='<Required> path of the folder to restore (without the name of the bucket)',
        required=True)
    parser.add_argument(
        '--days',
        type=int,
        help='number of days before deletion of the restored object copy (default: %(default)s)',
        default=2,
        required=False)
    parser.add_argument(
        '--mode',
        choices=['STANDARD', 'BULK'],
        default='STANDARD',
        help='''
        Access tier option.
        STANDARD = restoration within 12 hours
        BULK = restoration within 48 hours
        (default: %(default)s)
        ''',
        required=False)
    return parser.parse_args()
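A quick argparse gotcha worth knowing when declaring `--bucket`: with `nargs=1` the parsed value is a one-element list while a string default is returned as-is, so downstream code sees two different types depending on whether the flag was passed. It is safer to omit `nargs` for a single value. A small demonstration:

```python
import argparse

# nargs=1 wraps the supplied value in a list, but the default stays a string.
p = argparse.ArgumentParser()
p.add_argument('--bucket', nargs=1, default='my-bucket')
print(p.parse_args(['--bucket', 'b']).bucket)  # ['b']
print(p.parse_args([]).bucket)                 # my-bucket
```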
def create_manifest(bucket, prefix):
    '''Create a manifest file listing every object to restore and upload it to S3.

    Parameters
    ----------
    bucket : str
        name of the bucket
    prefix : str
        path of the folder to restore (without the name of the bucket)

    Returns
    -------
    manifest_object
        S3 object of the manifest file
    '''
    s3 = boto3.resource('s3')
    my_bucket = s3.Bucket(bucket)
    prefix_file_name = prefix.replace("/", "_")
    manifest = open(prefix_file_name, "w")
    logfile = open("restoration.log", "w")
    for obj in my_bucket.objects.filter(Prefix=prefix):
        if obj.storage_class == "DEEP_ARCHIVE":
            try:
                key_url_encode = quote_plus(obj.key, safe='/')
                manifest.write(f'{bucket},{key_url_encode}\n')
            except Exception as e:
                logfile.write(f'For the object {obj.key}, {e}\n')
    manifest.close()
    logfile.close()
    if prefix.endswith('/'):
        manifest_key = prefix + 'Manifest2/Manifest.csv'
    else:
        manifest_key = prefix + '/Manifest2/Manifest.csv'
    my_bucket.upload_file(
        Filename=prefix_file_name,
        Key=manifest_key)
    os.remove(prefix_file_name)
    return s3.Object(bucket, manifest_key)
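S3 Batch Operations expects the object keys in a CSV manifest to be URL-encoded; `quote_plus` with `safe='/'` keeps path separators readable while form-encoding spaces and reserved characters. For example (the key name is made up for illustration):

```python
from urllib.parse import quote_plus

# safe='/' preserves the path separators; spaces become '+',
# and a literal '+' in the key becomes %2B.
key = "archive/my report 2021+final.csv"
encoded = quote_plus(key, safe='/')
print(encoded)  # archive/my+report+2021%2Bfinal.csv
```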
def create_job(manifest_object, bucket, days, tiers, prefix):
    '''Create an S3 batch job.

    Parameters
    ----------
    manifest_object : object
        S3 object of the manifest file
    bucket : str
        name of the bucket
    days : int
        number of days before deletion of the restored object copy
    tiers : str
        access tier option
    prefix : str
        path of the folder to restore (without the name of the bucket)

    Returns
    -------
    None
    '''
    clientjobs = boto3.client('s3control')
    manifest_arn = 'arn:aws:s3:::' + bucket + '/' + manifest_object.key
    bucket_arn = 'arn:aws:s3:::' + bucket
    if prefix.endswith('/'):
        report_key = prefix + 'report_batch_jobs'
    else:
        report_key = prefix + '/report_batch_jobs'
    clientjobs.create_job(
        AccountId='myaccountid',
        ConfirmationRequired=False,
        Operation={
            'S3InitiateRestoreObject': {
                'ExpirationInDays': days,
                'GlacierJobTier': tiers
            }
        },
        Report={
            'Bucket': bucket_arn,
            'Format': 'Report_CSV_20180820',
            'Enabled': True,
            'Prefix': report_key,
            'ReportScope': 'FailedTasksOnly'
        },
        Manifest={
            'Spec': {
                'Format': 'S3BatchOperations_CSV_20180820',
                'Fields': ['Bucket', 'Key']
            },
            'Location': {
                'ObjectArn': manifest_arn,
                'ETag': manifest_object.e_tag
            }
        },
        Priority=1,
        RoleArn='arnofiamrole')
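Note that `create_job` returns as soon as the job is created; the restore itself runs asynchronously. If you want the script to block until the job finishes, a small polling helper can wrap `s3control.describe_job` (a sketch: `fetch_status` here is a stand-in for a callable returning `describe_job(AccountId=..., JobId=...)['Job']['Status']`):

```python
import time

def wait_for_job(fetch_status, poll_seconds=30, sleep=time.sleep):
    """Poll until the batch job reaches a terminal state.

    fetch_status is any zero-argument callable returning the job's status
    string; in practice it would wrap s3control's describe_job call.
    """
    terminal = {'Complete', 'Failed', 'Cancelled'}
    while True:
        status = fetch_status()
        if status in terminal:
            return status
        sleep(poll_seconds)
```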
def main():
    args = get_arguments()
    manifest_object = create_manifest(args.bucket, args.prefix)
    create_job(manifest_object, args.bucket, args.days, args.mode, args.prefix)

if __name__ == '__main__':
    main()
I just removed the personal information (account ID and IAM role ARN).