Move files from one s3 bucket to another in AWS using AWS lambda

I am trying to move files older than an hour from one s3 bucket to another s3 bucket using a Python boto3 AWS Lambda function, with the following cases:

  1. Both buckets can be in the same account and different regions.
  2. Both buckets can be in different accounts and different regions.
  3. Both buckets can be in different accounts and the same region.

I got some help to move files using the Python code mentioned by @John Rotenstein:

import boto3
from datetime import datetime, timedelta

SOURCE_BUCKET = 'bucket-a'
DESTINATION_BUCKET = 'bucket-b'

s3_client = boto3.client('s3')

# Create a reusable Paginator
paginator = s3_client.get_paginator('list_objects_v2')

# Create a PageIterator from the Paginator
page_iterator = paginator.paginate(Bucket=SOURCE_BUCKET)

# Loop through each object, looking for ones older than a given time period
for page in page_iterator:
    for obj in page.get('Contents', []):   # .get() handles an empty bucket (no 'Contents' key)
        if obj['LastModified'] < datetime.now().astimezone() - timedelta(hours=1):   # <-- Change time period here
            print(f"Moving {obj['Key']}")

            # Copy object
            s3_client.copy_object(
                Bucket=DESTINATION_BUCKET,
                Key=obj['Key'],
                CopySource={'Bucket': SOURCE_BUCKET, 'Key': obj['Key']}
            )

            # Delete original object
            s3_client.delete_object(Bucket=SOURCE_BUCKET, Key=obj['Key'])

How can this be modified to cater to these requirements?

Moving between regions

This is a non-issue. You can just copy the object between buckets and Amazon S3 will figure it out.
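If the buckets are in different regions, one pattern (a sketch; the region name is an assumption) is to create the client in the destination bucket's region, since CopyObject is sent to the destination bucket and pulls from the source server-side:

import boto3

# Point the client at the destination bucket's region (assumed here to be eu-west-1);
# the copy_object() call itself is unchanged, and the object data never passes through Lambda.
s3_client = boto3.client('s3', region_name='eu-west-1')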

Moving between accounts

This is a bit harder because the code uses a single set of credentials, which must have ListBucket and GetObject access on the source bucket, plus PutObject rights on the destination bucket.
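As a sketch of those permissions (the role and policy names are hypothetical; bucket names are taken from the code above), the Lambda function's execution role would need roughly these statements. Note that DeleteObject is also required because the code deletes the original:

import json
import boto3

iam_client = boto3.client('iam')
iam_client.put_role_policy(
    RoleName='lambda-s3-move-role',  # hypothetical name of the Lambda function's role
    PolicyName='s3-move-objects',    # hypothetical policy name
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:ListBucket",
             "Resource": "arn:aws:s3:::bucket-a"},
            {"Effect": "Allow", "Action": ["s3:GetObject", "s3:DeleteObject"],
             "Resource": "arn:aws:s3:::bucket-a/*"},
            {"Effect": "Allow", "Action": ["s3:PutObject", "s3:PutObjectAcl"],
             "Resource": "arn:aws:s3:::bucket-b/*"}
        ]
    })
)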

Also, if credentials from the Source account are being used, then the copy must be performed with ACL='bucket-owner-full-control', otherwise the Destination account won't have access rights to the object. This is not required when the copy is being performed with credentials from the Destination account.

Let's say that the Lambda code is running in Account-A and is copying an object to Account-B. An IAM Role (Role-A) is assigned to the Lambda function. It's pretty easy to give Role-A access to the buckets in Account-A. However, the Lambda function will need permission to PutObject in the bucket (Bucket-B) in Account-B. Therefore, you'll need to add a bucket policy to Bucket-B that allows Role-A to PutObject into the bucket. This way, Role-A has permission to read from Bucket-A and write to Bucket-B.
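A sketch of that bucket policy, applied with Account-B credentials (the account ID 111111111111 standing in for Account-A is a placeholder):

import json
import boto3

# Run this with credentials from Account-B, the owner of Bucket-B
s3_b = boto3.client('s3')
s3_b.put_bucket_policy(
    Bucket='bucket-b',
    Policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111111111111:role/Role-A"},  # placeholder Account-A ID
            "Action": ["s3:PutObject", "s3:PutObjectAcl"],
            "Resource": "arn:aws:s3:::bucket-b/*"
        }]
    })
)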

So, putting it all together:

  • Create an IAM Role (Role-A) for the Lambda function
  • Give the role Read/Write access as necessary for buckets in the same account
  • For buckets in other accounts, add a Bucket Policy that grants the necessary access permissions to the IAM Role (Role-A)
  • In the copy_object() command, include ACL='bucket-owner-full-control' (this is the only coding change needed; see the sketch after this list)
  • Don't worry about doing anything for cross-region; it should just work automatically
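Concretely, the only change to the code above is the extra ACL parameter (shown here with the loop variable obj):

# Copy object, granting the destination bucket's owner full control
s3_client.copy_object(
    Bucket=DESTINATION_BUCKET,
    Key=obj['Key'],
    CopySource={'Bucket': SOURCE_BUCKET, 'Key': obj['Key']},
    ACL='bucket-owner-full-control'  # required when copying with Source-account credentials
)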

An alternate approach would be to use Amazon S3 Replication, which can replicate bucket contents:

  • Within the same region, or between regions
  • Within the same AWS Account, or between different Accounts

Replication is frequently used when organizations need another copy of their data in a different region, or simply for backup purposes. For example, critical company information can be replicated to another AWS Account that is not accessible to normal users. This way, if some data was deleted, there is another copy of it elsewhere.

Replication requires versioning to be activated on both the source and destination buckets. If you require encryption, use standard Amazon S3 encryption options. The data will also be encrypted during transit.
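Versioning can be switched on with put_bucket_versioning(), as in this sketch (if the destination bucket lives in another account, the same call would need to be made with that account's credentials):

for bucket in (SOURCE_BUCKET, DESTINATION_BUCKET):
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={'Status': 'Enabled'}
    )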

You configure a source bucket and a destination bucket, then specify which objects to replicate by providing a prefix or a tag. Objects will only be replicated once Replication is activated. Existing objects will not be copied. Deletion is intentionally not replicated to avoid malicious actions. See: What Does Amazon S3 Replicate?
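A minimal replication rule might look like the following sketch (the replication role ARN, account ID, and prefix are assumptions; that role must allow S3 to read from the source bucket and replicate into the destination):

s3_client.put_bucket_replication(
    Bucket=SOURCE_BUCKET,
    ReplicationConfiguration={
        'Role': 'arn:aws:iam::111111111111:role/s3-replication-role',  # hypothetical role ARN
        'Rules': [{
            'ID': 'replicate-incoming-files',
            'Priority': 1,
            'Status': 'Enabled',
            'Filter': {'Prefix': 'incoming/'},                  # assumed prefix
            'DeleteMarkerReplication': {'Status': 'Disabled'},  # deletions are not replicated
            'Destination': {'Bucket': 'arn:aws:s3:::bucket-b'}
        }]
    }
)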

There is no "additional" cost for S3 replication, but you will still be charged for Data Transfer when moving objects between regions and for API requests (which are tiny charges), plus storage, of course.
