Move files from one s3 bucket to another in AWS using AWS lambda
I am trying to move files older than an hour from one S3 bucket to another S3 bucket using a Python boto3 AWS Lambda function, with the following cases:
I got some help to move files using the Python code mentioned by @John Rotenstein
import boto3
from datetime import datetime, timedelta

SOURCE_BUCKET = 'bucket-a'
DESTINATION_BUCKET = 'bucket-b'

s3_client = boto3.client('s3')

# Create a reusable Paginator
paginator = s3_client.get_paginator('list_objects_v2')

# Create a PageIterator from the Paginator
page_iterator = paginator.paginate(Bucket=SOURCE_BUCKET)

# Loop through each object, looking for ones older than a given time period
for page in page_iterator:
    for obj in page.get('Contents', []):  # .get() avoids a KeyError when the bucket is empty
        if obj['LastModified'] < datetime.now().astimezone() - timedelta(hours=1):  # <-- Change time period here
            print(f"Moving {obj['Key']}")

            # Copy object
            s3_client.copy_object(
                Bucket=DESTINATION_BUCKET,
                Key=obj['Key'],
                CopySource={'Bucket': SOURCE_BUCKET, 'Key': obj['Key']}
            )

            # Delete original object
            s3_client.delete_object(Bucket=SOURCE_BUCKET, Key=obj['Key'])
How can this be modified to cater to the requirement?
This is a non-issue. You can just copy the object between buckets and Amazon S3 will figure it out.
This is a bit harder because the code will use a single set of credentials, which must have ListBucket and GetObject access on the source bucket, plus PutObject rights to the destination bucket.
Also, if credentials are being used from the Source account, then the copy must be performed with ACL='bucket-owner-full-control', otherwise the Destination account won't have access rights to the object. This is not required when the copy is being performed with credentials from the Destination account.
Let's say that the Lambda code is running in Account-A and is copying an object to Account-B. An IAM Role (Role-A) is assigned to the Lambda function. It's pretty easy to give Role-A access to the buckets in Account-A. However, the Lambda function will need permissions to PutObject in the bucket (Bucket-B) in Account-B. Therefore, you'll need to add a bucket policy to Bucket-B that allows Role-A to PutObject into the bucket. This way, Role-A has permission to read from Bucket-A and write to Bucket-B.
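For reference, such a bucket policy could be generated as below; the account ID, role name and bucket name are placeholders, not values from the original answer:

```python
import json

# Placeholder values -- substitute your own account ID, role name and bucket name.
ROLE_A_ARN = 'arn:aws:iam::111111111111:role/Role-A'
BUCKET_B = 'bucket-b'

# Bucket policy for Bucket-B: allow Role-A (from Account-A) to put objects.
bucket_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AllowRoleAToPutObject',
        'Effect': 'Allow',
        'Principal': {'AWS': ROLE_A_ARN},
        'Action': 's3:PutObject',
        'Resource': f'arn:aws:s3:::{BUCKET_B}/*',
    }],
}

# Apply it with Destination-account credentials, e.g.:
# boto3.client('s3').put_bucket_policy(Bucket=BUCKET_B, Policy=json.dumps(bucket_policy))
print(json.dumps(bucket_policy, indent=2))
```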
So, putting it all together:

- Create an IAM Role (Role-A) for the Lambda function
- Give Role-A the necessary access to the buckets in Account-A
- Add a bucket policy to Bucket-B that grants the necessary access to Role-A
- When performing the copy_object() command, include ACL='bucket-owner-full-control' (this is the only coding change needed)

An alternate approach would be to use Amazon S3 Replication, which can replicate bucket contents:
Replication is frequently used when organizations need another copy of their data in a different region, or simply for backup purposes. For example, critical company information can be replicated to another AWS Account that is not accessible to normal users. This way, if some data was deleted, there is another copy of it elsewhere.
Replication requires versioning to be activated on both the source and destination buckets. If you require encryption, use standard Amazon S3 encryption options. The data will also be encrypted during transit.
You configure a source bucket and a destination bucket, then specify which objects to replicate by providing a prefix or a tag. Objects will only be replicated once Replication is activated. Existing objects will not be copied. Deletion is intentionally not replicated to avoid malicious actions. See: What Does Amazon S3 Replicate?
There is no "additional" cost for S3 replication, but you will still be charged for any Data Transfer when moving objects between regions, and for API Requests (which are tiny charges), plus storage, of course.