Error when deploying cross account Sagemaker Endpoints
I am using CDK to deploy a Sagemaker Endpoint in a cross-account context.
The following error appears when creating the Sagemaker Endpoint: Failed to download model data for container "container_1" from URL: "s3://.../model.tar.gz". Please ensure that there is an object located at the URL and that the role passed to CreateModel has permissions to download the object.
Here are some useful details.
I have two accounts:
In AccountA:
// Create bucket and KMS key to be used by the Sagemaker Pipeline
// KMS
const sagemakerKmsKey = new Key(this, "SagemakerBucketKMSKey", {
  description: "key used for encryption of data in Amazon S3",
  enableKeyRotation: true,
  policy: new PolicyDocument({
    statements: [
      new PolicyStatement({
        actions: ["kms:*"],
        effect: Effect.ALLOW,
        resources: ["*"],
        principals: [new AccountRootPrincipal()],
      }),
      new PolicyStatement({
        actions: ["kms:*"],
        effect: Effect.ALLOW,
        resources: ["*"],
        principals: [
          new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountA}:root`),
          new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountB}:root`),
        ],
      }),
    ],
  }),
});

// S3 Bucket
const sagemakerArtifactBucket = new Bucket(this, "SagemakerArtifactBucket", {
  bucketName: `mlops-${projectName}-${Aws.REGION}`,
  encryptionKey: sagemakerKmsKey,
  versioned: false,
  removalPolicy: RemovalPolicy.DESTROY,
});

sagemakerArtifactBucket.addToResourcePolicy(
  new PolicyStatement({
    actions: ["s3:*"],
    resources: [
      sagemakerArtifactBucket.bucketArn,
      `${sagemakerArtifactBucket.bucketArn}/*`,
    ],
    principals: [
      new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountA}:root`),
      new ArnPrincipal(`arn:${Aws.PARTITION}:iam::${AccountB}:root`),
    ],
  }),
);

// Define Code Build Deploy Staging Action
const deployStagingAction = new CloudFormationCreateUpdateStackAction({
  actionName: "DeployStagingAction",
  runOrder: 1,
  adminPermissions: false,
  stackName: `${projectName}EndpointStaging`,
  templatePath: cdKSynthArtifact.atPath("staging.template.json"),
  replaceOnFailure: true,
  role: Role.fromRoleArn(
    this,
    "StagingActionRole",
    `arn:${Aws.PARTITION}:iam::${AccountB}:role/cdk-hnb659fds-deploy-role-${AccountB}-${Aws.REGION}`,
  ),
  deploymentRole: Role.fromRoleArn(
    this,
    "StagingDeploymentRole",
    `arn:${Aws.PARTITION}:iam::${AccountB}:role/cdk-hnb659fds-cfn-exec-role-${AccountB}-${Aws.REGION}`,
  ),
  cfnCapabilities: [
    CfnCapabilities.AUTO_EXPAND,
    CfnCapabilities.NAMED_IAM,
  ],
});
Specifically, the role that creates the Sagemaker Model and Sagemaker Endpoints should be cdk-hnb659fds-cfn-exec-role, as seen on CloudTrail, but for testing purposes I've granted both roles Administrator privileges (the error still appears).
The deployment in AccountA executes correctly, which means the bucket location is correct.
NOTE: everything is deployed correctly up to the Sagemaker Endpoint.
I managed to find the issue.
The problem was that, even though the bucket was created with a custom KMS key, the artifacts stored in the bucket are generated by an Estimator. If you do not specify the output_kms_key parameter, it will use a managed KMS key, which is different from the one used for the S3 bucket.
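As a rough sketch of the fix: in the SageMaker Python SDK the key is passed as the Estimator's output_kms_key argument, which corresponds to OutputDataConfig.KmsKeyId in the CreateTrainingJob API request. The helper name buildOutputDataConfig, the bucket name, the prefix, and the key ARN below are placeholders for illustration and are not taken from the original setup.

```typescript
// Shape of the OutputDataConfig member of a SageMaker CreateTrainingJob request.
interface OutputDataConfig {
  S3OutputPath: string;
  KmsKeyId?: string;
}

// Hypothetical helper: point the training output at the artifact bucket and
// encrypt it with the same customer-managed key that the bucket uses.
function buildOutputDataConfig(
  bucketName: string,
  prefix: string,
  kmsKeyArn: string,
): OutputDataConfig {
  // Trim leading and trailing slashes so the S3 URI has no empty path segments.
  const cleanPrefix = prefix.replace(/^\/+|\/+$/g, "");
  return {
    S3OutputPath: `s3://${bucketName}/${cleanPrefix}`,
    KmsKeyId: kmsKeyArn,
  };
}

// Placeholder values, for illustration only.
const exampleKeyArn =
  "arn:aws:kms:eu-west-1:111111111111:key/00000000-0000-0000-0000-000000000000";
const outputConfig = buildOutputDataConfig(
  "mlops-example-eu-west-1",
  "/training-output/",
  exampleKeyArn,
);
```

With the key set explicitly, model.tar.gz should be encrypted with the key whose policy already grants access to AccountB, rather than with a managed key whose policy cannot be edited.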
Even though the issue is not related to cross-account permissions, I'll leave it here in case someone has a similar issue.
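To check whether you are hitting the same problem, one option is to look at which key the artifact was actually encrypted with. S3 HeadObject (for example via aws s3api head-object) returns ServerSideEncryption and SSEKMSKeyId for an object. The sketch below only compares those two fields against the key you expect; the function name checkArtifactKey and the ARNs are made up for illustration.

```typescript
// Subset of the S3 HeadObject response fields that describe encryption.
interface ObjectEncryptionInfo {
  ServerSideEncryption?: string; // e.g. "AES256" or "aws:kms"
  SSEKMSKeyId?: string; // identifier of the KMS key when SSE-KMS is used
}

type KeyCheck = "matches-expected-key" | "different-kms-key" | "not-kms-encrypted";

// Hypothetical helper: classify the artifact's encryption relative to the
// customer-managed key that the bucket is supposed to use.
function checkArtifactKey(
  head: ObjectEncryptionInfo,
  expectedKeyArn: string,
): KeyCheck {
  const sse = head.ServerSideEncryption ?? "";
  if (!sse.startsWith("aws:kms") || !head.SSEKMSKeyId) {
    return "not-kms-encrypted";
  }
  return head.SSEKMSKeyId === expectedKeyArn
    ? "matches-expected-key"
    : "different-kms-key";
}

// Placeholder ARNs, for illustration only.
const bucketKeyArn =
  "arn:aws:kms:eu-west-1:111111111111:key/00000000-0000-0000-0000-000000000000";
const otherKeyArn =
  "arn:aws:kms:eu-west-1:111111111111:key/99999999-9999-9999-9999-999999999999";

const sameKey = checkArtifactKey(
  { ServerSideEncryption: "aws:kms", SSEKMSKeyId: bucketKeyArn },
  bucketKeyArn,
);
const wrongKey = checkArtifactKey(
  { ServerSideEncryption: "aws:kms", SSEKMSKeyId: otherKeyArn },
  bucketKeyArn,
);
const noKms = checkArtifactKey({ ServerSideEncryption: "AES256" }, bucketKeyArn);
```

A "different-kms-key" result for model.tar.gz would be consistent with the situation described above, where the Estimator wrote its output with a key other than the bucket's.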