
AWS Providing access to assumed role from another account to access S3 in my account

What I am trying to achieve is to copy objects from S3 in one account (A1 - not controlled by me) into S3 in another account (A2 - controlled by me). For that, OPS from A1 provided me a role I can assume, using the boto3 library:

session = boto3.Session()
sts_client = session.client('sts')

assumed_role = sts_client.assume_role(
    RoleArn="arn:aws:iam::1234567890123:role/organization",
    RoleSessionName="blahblahblah"
)
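
For completeness, using the temporary credentials returned by assume_role would look something like this (a sketch; s3_a1 is just an illustrative name):

creds = assumed_role['Credentials']
# S3 client that acts as the assumed A1 role.
s3_a1 = boto3.client(
    's3',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken']
)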

This part is OK. The problem is that a direct S3-to-S3 copy fails because the assumed role cannot access my S3:

s3 = boto3.resource('s3')
copy_source = {
    'Bucket': a1_bucket_name,
    'Key': key_name
}

bucket = s3.Bucket(a2_bucket_name)
bucket.copy(copy_source, hardcoded_key)

As a result of this I get

botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

in this line of code:

bucket.copy(copy_source, hardcoded_key)

Is there any way I can grant that assumed role access to my S3? I would really like a direct S3-to-S3 copy without downloading the file locally before uploading it again.

Please advise if there is a better approach than this.

The idea is to have this script running inside AWS Data Pipeline on a daily basis, for example.

To copy objects from one S3 bucket to another S3 bucket, you need to use one set of AWS credentials that has access to both buckets.

If those buckets are in different AWS accounts, you need 2 things:

  1. Credentials for the target bucket, and
  2. A bucket policy on the source bucket allowing read access to the target AWS account.

With these alone, you can copy objects. You do not need credentials for the source account.

  1. Add a bucket policy to your source bucket allowing read access to the target AWS account.

Here is a sample policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DelegateS3Access",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:root"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::BUCKET_NAME",
                "arn:aws:s3:::BUCKET_NAME/*"
            ]
        }
    ]
}

Be sure to replace BUCKET_NAME with your source bucket name, and replace 123456789012 with your target AWS account number.

  2. Using credentials for your target AWS account (the owner of the target bucket), perform the copy.
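
For that copy step, the call can stay a direct S3-to-S3 copy in boto3; a minimal sketch, assuming the process runs with the target account's credentials and that the bucket and key names below are placeholders:

import boto3

# Run with credentials for the target account (the owner of the destination bucket).
s3 = boto3.resource('s3')

copy_source = {
    'Bucket': 'SOURCE_BUCKET_NAME',   # source bucket in the other account
    'Key': 'path/to/object'
}

# Server-side copy: the target credentials read the source object (allowed by the
# source bucket policy above) and write it into the target bucket.
s3.Bucket('TARGET_BUCKET_NAME').copy(copy_source, 'path/to/object')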

Additional Notes:

You can also copy objects by reversing the two requirements:

  1. Credentials for the source AWS account, and
  2. A bucket policy on the target bucket allowing write access to the source AWS account.

However, when done this way, object metadata does not get copied correctly. I have discussed this issue with AWS Support, and they recommend reading from the foreign account rather than writing to the foreign account to avoid this problem.
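
For completeness, the target-bucket policy for this reversed approach would mirror the earlier sample with the roles swapped; a sketch in which 111111111111 stands for the source AWS account number and TARGET_BUCKET_NAME for the target bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DelegateS3WriteAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111111111111:root"
            },
            "Action": [
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": "arn:aws:s3:::TARGET_BUCKET_NAME/*"
        }
    ]
}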

This is sample code to transfer data between two S3 buckets in two different AWS accounts, using the legacy boto library (boto 2, Python 2):

# Legacy boto (boto 2) imports; this script targets Python 2.
from boto.s3.connection import S3Connection
from boto.s3.key import Key
from Queue import LifoQueue
import threading

# Credentials for the source and destination AWS accounts,
# and the names of the buckets to copy between.
source_aws_key = '*******************'
source_aws_secret_key = '*******************'
dest_aws_key = '*******************'
dest_aws_secret_key = '*******************'
srcBucketName = '*******************'
dstBucketName = '*******************'

class Worker(threading.Thread):
    # Worker thread: pulls key names off the queue and copies them
    # from the source bucket to the destination bucket.
    def __init__(self, queue):
        threading.Thread.__init__(self)
        # One connection per account, per worker thread.
        self.source_conn = S3Connection(source_aws_key, source_aws_secret_key)
        self.dest_conn = S3Connection(dest_aws_key, dest_aws_secret_key)
        self.srcBucket = self.source_conn.get_bucket(srcBucketName)
        self.dstBucket = self.dest_conn.get_bucket(dstBucketName)
        self.queue = queue

    def run(self):
        while True:
            key_name = self.queue.get()
            k = Key(self.srcBucket, key_name)
            dist_key = Key(self.dstBucket, k.key)
            # Copy only if the destination object is missing or its ETag differs.
            if not dist_key.exists() or k.etag != dist_key.etag:
                print 'copy: ' + k.key
                # Server-side copy performed with the destination account's credentials.
                self.dstBucket.copy_key(k.key, srcBucketName, k.key, storage_class=k.storage_class)
            else:
                print 'exists and etag matches: ' + k.key

            self.queue.task_done()

def copyBucket(maxKeys = 1000):
    print 'start'

    s_conn = S3Connection(source_aws_key, source_aws_secret_key)
    srcBucket = s_conn.get_bucket(srcBucketName)

    resultMarker = ''
    q = LifoQueue(maxsize=5000)

    # Start 10 worker threads that consume key names from the queue.
    for i in range(10):
        print 'adding worker'
        t = Worker(q)
        t.daemon = True
        t.start()

    # Page through the source bucket listing and enqueue every key name.
    while True:
        print 'fetch next 1000, backlog currently at %i' % q.qsize()
        keys = srcBucket.get_all_keys(max_keys = maxKeys, marker = resultMarker)
        for k in keys:
            q.put(k.key)
        if len(keys) < maxKeys:
            print 'Done'
            break
        resultMarker = keys[maxKeys - 1].key

    # Block until every queued copy has been processed.
    q.join()
    print 'done'

if __name__ == "__main__":
    copyBucket()
