
How to copy files between S3 buckets in 2 different accounts using boto3

I'm trying to copy files from a vendor's S3 bucket to my S3 bucket using boto3. I'm using the STS service to assume a role to access the vendor's S3 bucket. I'm able to connect to the vendor bucket and get a listing of the bucket, but I run into a CopyObject operation: Access Denied error when copying to my bucket. Here is my script:

import boto3

session = boto3.session.Session(profile_name="s3_transfer")
sts_client = session.client("sts", verify=False)
assumed_role_object = sts_client.assume_role(
    RoleArn="arn:aws:iam::<accountid>:role/assumedrole",
    RoleSessionName="transfer_session",
    ExternalId="<ID>",
    DurationSeconds=18000,
)

creds = assumed_role_object["Credentials"]
# Client that reads from the vendor bucket using the assumed-role credentials.
src_s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
    verify=False,
)
paginator = src_s3.get_paginator("list_objects_v2")
# Testing with just 2 items.
# TODO: Remove MaxItems once the script works.
pages = paginator.paginate(
    Bucket="ven_bucket", Prefix="client", PaginationConfig={"MaxItems": 2, "PageSize": 1000}
)
# Client that writes to my bucket using my own profile's credentials.
dest_s3 = session.client("s3", verify=False)
for page in pages:
    for obj in page["Contents"]:
        src_key = obj["Key"]
        # src_prefix / dest_prefix are defined elsewhere in the full script.
        des_key = dest_prefix + src_key[len(src_prefix):]
        src = {"Bucket": "ven_bucket", "Key": src_key}
        print(src)
        print(des_key)
        dest_s3.copy(src, "my-bucket", des_key, SourceClient=src_s3)

The dest_s3.copy(...) line is where I get the error. I have the following policy on my AWS user to allow copying to my bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket/*",
                "arn:aws:s3:::my-bucket"
            ]
        }
    ]
}

I get the following error when running the above script:

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the CopyObject operation: Access Denied

The CopyObject() command can be used to copy objects between buckets without having to upload/download. Basically, the two S3 buckets communicate with each other and transfer the data.

This command can also be used to copy between buckets that are in different regions and different AWS accounts.

If you wish to copy between buckets that belong to different AWS accounts, then you will need to use a single set of credentials that have both of the following (a sketch of such a policy follows the list):

  • GetObject permission on the source bucket
  • PutObject permission on the destination bucket
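
For example, a minimal sketch of an IAM policy carrying both permissions might look like this (the bucket names are placeholders, and s3:ListBucket is included because listing the source bucket is usually needed as well):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::source-bucket",
                "arn:aws:s3:::source-bucket/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::destination-bucket/*"
        }
    ]
}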

Also, please note that the CopyObject() command is sent to the destination account. The destination bucket effectively pulls the objects from the source bucket.

From your description, your code is assuming a role from the other account to gain read permission on the source bucket. Unfortunately, this is not sufficient for the CopyObject() command, because the command must be sent to the destination bucket. (Yes, it is a little hard to discern this from the documentation. That is why the source bucket is specifically named, rather than the destination bucket.)

Therefore, in your situation, to be able to copy the objects, you will need to use a set of credentials from Account-B (the destination) that also has permission to read from Bucket-A (the source). This will require the vendor to modify the Bucket Policy associated with Bucket-A.
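
For instance, a bucket policy on Bucket-A granting your account read access might look roughly like this (a sketch only; the account ID and bucket name are placeholders, and the vendor may prefer to scope the Principal down to a specific role or user rather than the whole account):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::ACCOUNT_B_ID:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::bucket-a",
                "arn:aws:s3:::bucket-a/*"
            ]
        }
    ]
}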

If they do not wish to do this, then your only option is to download the objects using the assumed role and then separately upload the files to your own bucket using credentials from your own Account-B.
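
A minimal sketch of that fallback, reusing the src_s3 (assumed-role) and dest_s3 (own-credentials) clients from the question; it buffers each object in memory, so it suits small files (stream to a temporary file instead for large ones):

import io

# src_s3 reads from the vendor bucket with the assumed-role credentials;
# dest_s3 writes to your own bucket with your Account-B credentials.
buf = io.BytesIO()
src_s3.download_fileobj("ven_bucket", src_key, buf)
buf.seek(0)  # rewind the buffer before uploading
dest_s3.upload_fileobj(buf, "my-bucket", des_key)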

I know this is old, but your code works just fine and there's no need to modify it. If you have the destination account (A) and the source account (B), then you correctly invoke Lambda in A to assume a role in B and copy files from S3 in B to S3 in A, just like you did. The core, actual solution to your problem is the proper configuration of permissions. Please refer to these two AWS docs pages that actually solve the problem:

https://aws.amazon.com/premiumsupport/knowledge-center/copy-s3-objects-account/
https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-assume-iam-role/

One additional thing that I'd add, though, is the ACL. This will ensure that objects copied from B to A will be owned by the destination bucket's owner:

dest_s3.copy(
    CopySource=...,
    Bucket=...,
    Key=...,
    ExtraArgs={
        "ACL": "bucket-owner-full-control"
    }
)
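
(On newer buckets the "bucket owner enforced" setting of S3 Object Ownership, which is the default for buckets created since April 2023, disables ACLs entirely; in that case the destination account owns copied objects automatically and the ExtraArgs can be dropped.)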
  1. On the source AWS account, add this policy to the source S3 bucket:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::SOURCE_BUCKET_NAME",
                "arn:aws:s3:::SOURCE_BUCKET_NAME/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::DESTINATION_BUCKET_NAME",
                "arn:aws:s3:::DESTINATION_BUCKET_NAME/*"
            ]
        }
    ]
}
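
Note that as written this JSON has no Principal element, so S3 will reject it as a bucket policy; to attach it to the source bucket, the statement granting read access would also need a Principal naming the destination account, roughly like this (the account ID is a placeholder):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::DESTINATION_ACCOUNT_ID:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::SOURCE_BUCKET_NAME",
                "arn:aws:s3:::SOURCE_BUCKET_NAME/*"
            ]
        }
    ]
}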
  2. Using the destination account's credentials:
import boto3

boto3_session = boto3.Session(aws_access_key_id="<your access key>",
                              aws_secret_access_key="<your secret access key>")
s3_resource = boto3_session.resource('s3')
bucket = s3_resource.Bucket("<source bucket name>")

for obj in bucket.objects.all():
    obj_path = str(obj.key)

    copy_source = {
        'Bucket': "<source bucket name>",
        'Key': obj_path
    }
    s3_resource.meta.client.copy(copy_source, "<destination bucket name>", obj_path)
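
Note that bucket.objects.all() pages through the listing automatically, and the managed copy() transfer switches to multipart copies for large objects, so this loop handles buckets of any size.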
