
How to get AWS Glue crawler to assume a role in another AWS account to get data from that account's S3 bucket?

I need to fetch some CSV data files from S3 buckets that belong to a series of AWS accounts owned by a third party. The owner of those accounts has created a role in each account that grants me access to the files, and I can use the AWS web console (logged in to my own account) to switch to each role and get the files. At the moment I do this one account at a time: I switch to the role for an account, get that account's files, then move on to the next account, and so on.

I'd like to automate this process.

It looks like AWS Glue can do this, but I'm having trouble with the permissions.

What I need is to set up permissions so that an AWS Glue crawler can switch to the right role (one in each of the other AWS accounts) and get the data files from those accounts' S3 buckets.

Is this possible and, if so, how can I set it up (e.g. what IAM roles/permissions are needed)? I'd prefer to limit changes to my own account if possible, rather than having to ask the other account owner to make changes on their side.

If it's not possible with Glue, is there some other easy way to do it with a different AWS service?

Thanks!

(I've had a series of tries but I keep getting it wrong; my attempts are so far from being right that there's no point in posting the details here.)

Using the AWS CLI, you can create named profiles for each of the roles you want to switch to, then refer to them from the CLI. You can then chain these calls, referencing the named profile for each role, and include them in a script to automate the process.

From Switching to an IAM Role (AWS Command Line Interface)

A role specifies a set of permissions that you can use to access AWS resources that you need. In that sense, it is similar to a user in AWS Identity and Access Management (IAM). When you sign in as a user, you get a specific set of permissions. However, you don't sign in to a role, but once signed in as a user you can switch to a role. This temporarily sets aside your original user permissions and instead gives you the permissions assigned to the role. The role can be in your own account or any other AWS account. For more information about roles, their benefits, and how to create and configure them, see IAM Roles, and Creating IAM Roles.
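As a rough sketch of the named-profile approach, the snippet below generates `~/.aws/config` entries with one profile per cross-account role. `role_arn` and `source_profile` are the standard AWS CLI settings for role profiles; the profile names and role ARNs are hypothetical placeholders, and `render_profiles` is just an illustrative helper, not part of any AWS tool.

```python
import configparser
import io


def render_profiles(role_arns, source_profile="default"):
    """Return AWS CLI config text with one named profile per role ARN."""
    config = configparser.ConfigParser()
    for name, arn in role_arns.items():
        # In ~/.aws/config, named profiles use the section header "profile <name>".
        config[f"profile {name}"] = {
            "role_arn": arn,
            "source_profile": source_profile,
        }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical placeholders: substitute the role ARNs the other account owner gave you.
ROLE_ARNS = {
    "vendor-a": "arn:aws:iam::111111111111:role/ExampleDataAccess",
    "vendor-b": "arn:aws:iam::222222222222:role/ExampleDataAccess",
}

config_text = render_profiles(ROLE_ARNS)
print(config_text)
```

Once the generated sections are appended to `~/.aws/config`, a script can loop over the profile names and run something like `aws s3 cp s3://<bucket>/<key> . --profile vendor-a` for each one.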

You can achieve this with AWS Lambda and CloudWatch Events rules.

Create a Lambda function with a role attached to it; let's call this role Role A. Depending on the number of accounts, you can either create one function per account and one CloudWatch rule that triggers all of the functions, or create a single function for all the accounts (be mindful of the limits of AWS Lambda, such as the maximum execution time).

Creating Role A

  1. Create an IAM Role (Role A) with the following policy allowing it to assume the role given to you by the other accounts containing the data.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1509358389000",
            "Effect": "Allow",
            "Action": [
                "sts:AssumeRole"
            ],
            "Resource": [
                "<role ARN from account 1>",
                "<role ARN from account 2>"
            ]
        }
    ]
}

Under Resource, list all the IAM role ARNs from the accounts containing the data, one entry per account. If you have one function per account, you can instead opt for separate roles, each listing a single ARN.

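You will also need to make sure the trust relationships are in place: Role A's trust policy must allow the Lambda service (lambda.amazonaws.com) to assume it, and the trust policy of each role in the other accounts must allow your account (or Role A specifically) to assume that role.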

  2. Attach Role A to the Lambda functions you will be running. You can use the Serverless Framework for development. Your Lambda function now has Role A attached to it, and Role A has sts:AssumeRole permission over the roles created in the other accounts.

  3. Assuming you have created one function per account, your Lambda code first has to use STS to assume the other account's role and obtain temporary credentials, then pass those credentials to the S3 client options before fetching the required data.

If you have created one function for all the accounts, you can keep the role ARNs in an array and iterate over it. Again, be aware of the limits of AWS Lambda when doing this.
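The steps above could look roughly like the following in a Python Lambda function. This is only a sketch under some assumptions: the event shape (`event["sources"]` with `role_arn` and `bucket` keys), the session name and the helper names are all hypothetical, and it only lists the CSV keys rather than copying them anywhere. The boto3 calls used (`assume_role`, `list_objects_v2`) are standard, and the clients are passed in so the helpers can be exercised without AWS access.

```python
def assume_role_credentials(sts_client, role_arn, session_name="csv-fetch"):
    """Call sts:AssumeRole and return keyword arguments for a new boto3 client."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def list_csv_keys(s3_client, bucket, prefix=""):
    """Return the keys of all .csv objects under a prefix, following pagination."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".csv"):
                keys.append(obj["Key"])
        if not page.get("IsTruncated"):
            return keys
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def handler(event, context):
    """Lambda entry point: iterate over the accounts and list their CSV files.

    Assumed (hypothetical) event shape:
    {"sources": [{"role_arn": "...", "bucket": "..."}, ...]}
    """
    import boto3  # provided by the Lambda Python runtime

    sts = boto3.client("sts")  # runs with Role A's permissions
    results = {}
    for source in event["sources"]:
        creds = assume_role_credentials(sts, source["role_arn"])
        # This S3 client acts as the role in the other account.
        s3 = boto3.client("s3", **creds)
        results[source["bucket"]] = list_csv_keys(s3, source["bucket"])
    return results
```

From here, each key could be fetched with the same cross-account S3 client and written to a bucket in your own account using a second client that runs as Role A.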

Yes, you can automate your scenario with Glue by following these steps:

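  • Create an IAM role in your AWS account. The role's name should start with AWSGlueServiceRole, because the AWS managed Glue console policy grants permission to pass only roles with that prefix; you can append whatever you want after it. Add a trust relationship for Glue, such as: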

     {
         "Version": "2012-10-17",
         "Statement": [
             {
                 "Effect": "Allow",
                 "Principal": { "Service": "glue.amazonaws.com" },
                 "Action": "sts:AssumeRole"
             }
         ]
     }

  • Attach two IAM policies to your IAM role: the AWS managed policy named AWSGlueServiceRole, and a custom policy that provides the access needed to all the target cross-account S3 buckets, such as:

     {
         "Version": "2012-10-17",
         "Statement": [
             {
                 "Sid": "BucketAccess",
                 "Effect": "Allow",
                 "Action": [ "s3:ListBucket", "s3:GetBucketLocation" ],
                 "Resource": [
                     "arn:aws:s3:::examplebucket1",
                     "arn:aws:s3:::examplebucket2",
                     "arn:aws:s3:::examplebucket3"
                 ]
             },
             {
                 "Sid": "ObjectAccess",
                 "Effect": "Allow",
                 "Action": "s3:GetObject",
                 "Resource": [
                     "arn:aws:s3:::examplebucket1/*",
                     "arn:aws:s3:::examplebucket2/*",
                     "arn:aws:s3:::examplebucket3/*"
                 ]
             }
         ]
     }

  • Add an S3 bucket policy to each target bucket that allows your IAM role the same S3 access that you granted it in your account, such as:

     {
         "Version": "2012-10-17",
         "Statement": [
             {
                 "Sid": "BucketAccess",
                 "Effect": "Allow",
                 "Principal": { "AWS": "arn:aws:iam::your_account_number:role/AWSGlueServiceRoleDefault" },
                 "Action": [ "s3:ListBucket", "s3:GetBucketLocation" ],
                 "Resource": "arn:aws:s3:::examplebucket1"
             },
             {
                 "Sid": "ObjectAccess",
                 "Effect": "Allow",
                 "Principal": { "AWS": "arn:aws:iam::your_account_number:role/AWSGlueServiceRoleDefault" },
                 "Action": "s3:GetObject",
                 "Resource": "arn:aws:s3:::examplebucket1/*"
             }
         ]
     }

  • Finally, create Glue crawlers and jobs in your account (in the same Regions as the target cross-account S3 buckets) that will ETL the data from the cross-account S3 buckets into your account.
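With many buckets, the custom policy and the crawlers get repetitive, so it may help to generate them. The sketch below builds the same identity policy as in the second bullet from a list of bucket names, plus the keyword arguments for the Glue CreateCrawler call (`Name`, `Role`, `DatabaseName`, `Targets`). The crawler naming scheme, the database name `cross_account_csv` and the helper functions are assumptions made for illustration; the bucket and role names are the example ones used above.

```python
import json


def s3_read_policy(bucket_names):
    """Build the custom read-only S3 policy for a list of bucket names."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BucketAccess",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{name}" for name in bucket_names],
            },
            {
                "Sid": "ObjectAccess",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": [f"arn:aws:s3:::{name}/*" for name in bucket_names],
            },
        ],
    }


def crawler_request(bucket_name, role_name, database_name):
    """Build keyword arguments for one Glue crawler targeting one bucket."""
    return {
        "Name": f"crawl-{bucket_name}",  # hypothetical naming scheme
        "Role": role_name,
        "DatabaseName": database_name,
        "Targets": {"S3Targets": [{"Path": f"s3://{bucket_name}/"}]},
    }


BUCKETS = ["examplebucket1", "examplebucket2", "examplebucket3"]

policy_json = json.dumps(s3_read_policy(BUCKETS), indent=2)
crawler_requests = [
    crawler_request(name, "AWSGlueServiceRoleDefault", "cross_account_csv")
    for name in BUCKETS
]
print(policy_json)
```

Each entry in `crawler_requests` could then be passed to boto3 as `boto3.client("glue").create_crawler(**request)`, assuming the role and the bucket policies from the earlier bullets are already in place.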

