Error when deploying cross account Sagemaker Endpoints

I am using the AWS CDK to deploy a SageMaker endpoint in a cross-account context.

The following error appears when creating the Sagemaker Endpoint: Failed to download model data for container "container_1" from URL: "s3://.../model.tar.gz". Please ensure that there is an object located at the URL and that the role passed to CreateModel has permissions to download the object.

Here are some useful details.

I have two accounts:

  • Account A: contains the encrypted S3 bucket where the model artifact is stored, the SageMaker model package group with the latest approved version, and a CodePipeline that deploys the endpoint to both Account A and Account B.
  • Account B: hosts the endpoint deployed by the CodePipeline in Account A.

In AccountA:

  • Cross-account permissions are set on both the bucket and the KMS key used to encrypt it:
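// Imports used by the snippets in this post (module paths assume CDK v2)
import { Aws, CfnCapabilities, RemovalPolicy } from "aws-cdk-lib";
import { CloudFormationCreateUpdateStackAction } from "aws-cdk-lib/aws-codepipeline-actions";
import { AccountRootPrincipal, ArnPrincipal, Effect, PolicyDocument, PolicyStatement, Role } from "aws-cdk-lib/aws-iam";
import { Key } from "aws-cdk-lib/aws-kms";
import { Bucket } from "aws-cdk-lib/aws-s3";

// Create the bucket and KMS key to be used by the SageMaker Pipeline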

        //KMS
        const sagemakerKmsKey = new Key(
            this,
            "SagemakerBucketKMSKey",
            {
                description: "key used for encryption of data in Amazon S3",
                enableKeyRotation: true,
                policy: new PolicyDocument(
                    {
                        statements:[
                            new PolicyStatement(
                                {
                                    actions:["kms:*"],
                                    effect: Effect.ALLOW,
                                    resources:["*"],
                                    principals: [new AccountRootPrincipal()]
                                }
                            ),
                            new PolicyStatement(
                                {
                                    actions:[
                                        "kms:*"
                                    ],
                                    effect: Effect.ALLOW,
                                    resources:["*"],
                                    principals: [
                                        new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountA}:root`),
                                        new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountB}:root`),
                                    ]
                                }
                            )
                        ]
                    }
                )
            }
        )

        // S3 Bucket
        const sagemakerArtifactBucket = new Bucket(
            this,
            "SagemakerArtifactBucket",
            {
                bucketName:`mlops-${projectName}-${Aws.REGION}`,
                encryptionKey:sagemakerKmsKey,
                versioned:false,
                removalPolicy: RemovalPolicy.DESTROY
            }
        )
        
        sagemakerArtifactBucket.addToResourcePolicy(
            new PolicyStatement(
                {
                    actions: [
                        "s3:*",
                    ],
                    resources: [
                        sagemakerArtifactBucket.bucketArn,
                        `${sagemakerArtifactBucket.bucketArn}/*`
                    ],
                    principals: [
                        new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountA}:root`),
                        new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountB}:root`),
                    ]
                }
            )
        )
  • A CloudFormation create/update stack action in the CodePipeline deploys the SageMaker endpoint to Account A and Account B. The action for Account B:
// Define the CloudFormation deploy action for the staging stack
        const deployStagingAction = new CloudFormationCreateUpdateStackAction(
            {
                actionName: "DeployStagingAction",
                runOrder: 1,
                adminPermissions: false,
                stackName: `${projectName}EndpointStaging`,
                templatePath: cdKSynthArtifact.atPath("staging.template.json"),
                replaceOnFailure: true,
                role: Role.fromRoleArn(
                    this,
                    "StagingActionRole",
                    `arn:${Aws.PARTITION}:iam::${AccountB}:role/cdk-hnb659fds-deploy-role-${AccountB}-${Aws.REGION}`,
                ),
                deploymentRole: Role.fromRoleArn(
                    this,
                    "StagingDeploymentRole",
                    `arn:${Aws.PARTITION}:iam::${AccountB}:role/cdk-hnb659fds-cfn-exec-role-${AccountB}-${Aws.REGION}`
                ),
                cfnCapabilities: [
                    CfnCapabilities.AUTO_EXPAND,
                    CfnCapabilities.NAMED_IAM
                ]
            }
        )

Specifically, the role that creates the SageMaker model and endpoint should be cdk-hnb659fds-cfn-exec-role, as seen in CloudTrail. For testing purposes I granted Administrator privileges to both roles, but the error still appears.

The deployment in Account A succeeds, which means the bucket location is correct.

NOTE: everything is deployed correctly up to the Sagemaker Endpoint.

I managed to find the issue.

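The problem was that, even though the bucket was created with a custom KMS key, the artifacts stored in the bucket are generated by an Estimator. If you do not specify the output_kms_key parameter, the Estimator encrypts its output with a managed KMS key, which is different from the one used for the S3 bucket. The cross-account key policy above only covers the custom key, so Account B presumably could not decrypt model.tar.gz.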
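One way to confirm this kind of mismatch is to check which key actually encrypted the artifact. The sketch below is only an illustration, not part of my stack: `checkArtifactKey` and `HeadObjectEncryptionInfo` are hypothetical names, and the input is assumed to be the `ServerSideEncryption` and `SSEKMSKeyId` fields that S3 HeadObject returns (for example via `aws s3api head-object`). It compares key IDs only, so it assumes both values are key ARNs or bare key IDs, not aliases.

```typescript
// The subset of an S3 HeadObject response that describes encryption.
interface HeadObjectEncryptionInfo {
  ServerSideEncryption?: string; // e.g. "aws:kms" or "AES256"
  SSEKMSKeyId?: string;          // key that encrypted the object, usually a full ARN
}

type KeyCheck = { ok: true } | { ok: false; reason: string };

// Reduce "arn:...:key/<id>" (or a bare "<id>") to "<id>".
function keyId(arnOrId: string): string {
  return arnOrId.split("/").pop() ?? arnOrId;
}

// Hypothetical helper: does the artifact use the key whose policy
// grants access to the other account?
function checkArtifactKey(
  head: HeadObjectEncryptionInfo,
  expectedKeyArn: string,
): KeyCheck {
  if (head.ServerSideEncryption !== "aws:kms") {
    const got = head.ServerSideEncryption ?? "none";
    return { ok: false, reason: `object is not SSE-KMS encrypted (got ${got})` };
  }
  if (head.SSEKMSKeyId === undefined) {
    return { ok: false, reason: "response does not include SSEKMSKeyId" };
  }
  if (keyId(head.SSEKMSKeyId) !== keyId(expectedKeyArn)) {
    return {
      ok: false,
      reason: `object is encrypted with ${head.SSEKMSKeyId}, expected ${expectedKeyArn}`,
    };
  }
  return { ok: true };
}
```

If the check fails for model.tar.gz, the training job most likely wrote the artifact with a different key than the bucket's, and passing the bucket key as output_kms_key to the Estimator should make the two match.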

Even though the issue is not related to cross account permissions, I'll leave it here in case someone has a similar issue.
