AWS DMS replication task from Postgres RDS to Redshift getting AccessDenied on S3 bucket

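We have deployed a DMS replication task to replicate our entire Postgres database to Redshift. The tables are created with the correct schemas, but the data does not reach Redshift and is held up in the S3 bucket that DMS uses as an intermediate step. This is all deployed via Terraform.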

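We have configured the IAM roles as described in the replication instance Terraform documentation, creating all three of the dms-access-for-endpoint, dms-cloudwatch-logs-role, and dms-vpc-role IAM roles. The IAM roles are deployed via a different stack from the one that deploys DMS, because the roles are also used by another, successfully deployed DMS instance that runs a different task.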

data "aws_iam_policy_document" "dms_assume_role_document" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      identifiers = [
        "s3.amazonaws.com",
        "iam.amazonaws.com",
        "redshift.amazonaws.com",
        "dms.amazonaws.com",
        "redshift-serverless.amazonaws.com"
      ]
      type        = "Service"
    }
  }
}

# Database Migration Service requires the below IAM Roles to be created before
# replication instances can be created. See the DMS Documentation for
# additional information: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Security.html#CHAP_Security.APIRole
#  * dms-vpc-role
#  * dms-cloudwatch-logs-role
#  * dms-access-for-endpoint
resource "aws_iam_role" "dms_access_for_endpoint" {
  name                  = "dms-access-for-endpoint"
  assume_role_policy    = data.aws_iam_policy_document.dms_assume_role_document.json
  managed_policy_arns   = ["arn:aws:iam::aws:policy/service-role/AmazonDMSRedshiftS3Role"]
  force_detach_policies = true
}

resource "aws_iam_role" "dms_cloudwatch_logs_role" {
  name                  = "dms-cloudwatch-logs-role"
  description           = "Allow DMS to manage CloudWatch logs."
  assume_role_policy    = data.aws_iam_policy_document.dms_assume_role_document.json
  managed_policy_arns   = ["arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole"]
  force_detach_policies = true
}

resource "aws_iam_role" "dms_vpc_role" {
  name                  = "dms-vpc-role"
  description           = "DMS IAM role for VPC permissions"
  assume_role_policy    = data.aws_iam_policy_document.dms_assume_role_document.json
  managed_policy_arns   = ["arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole"]
  force_detach_policies = true
}

However, at runtime we see the following logs in CloudWatch:

2022-09-01T16:51:38 [SOURCE_UNLOAD   ]E:  Not retriable error: <AccessDenied> Access Denied [1001705]  (anw_retry_strategy.cpp:118)
2022-09-01T16:51:38 [SOURCE_UNLOAD   ]E:  Failed to list bucket 'dms-sandbox-redshift-intermediate-storage': error code <AccessDenied>: Access Denied [1001713]  (s3_dir_actions.cpp:105)
2022-09-01T16:51:38 [SOURCE_UNLOAD   ]E:  Failed to list bucket 'dms-sandbox-redshift-intermediate-storage' [1001713]  (s3_dir_actions.cpp:209)

We also enabled S3 server access logs on the bucket itself to see whether this would give us more information. This is what we see (anonymised):

<id> dms-sandbox-redshift-intermediate-storage [01/Sep/2022:15:43:32 +0000] 10.128.69.80 arn:aws:sts::<account>:assumed-role/dms-access-for-endpoint/dms-session-for-replication-engine <code> REST.GET.BUCKET - "GET /dms-sandbox-redshift-intermediate-storage?delimiter=%2F&max-keys=1000 HTTP/1.1" 403 AccessDenied 243 - 30 - "-" "aws-sdk-cpp/1.8.80/S3/Linux/4.14.276-211.499.amzn2.x86_64 x86_64 GCC/4.9.3" - <code> SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3.eu-west-2.amazonaws.com TLSv1.2 -

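The above suggests that the dms-session-for-replication-engine session, running under the assumed dms-access-for-endpoint role, is what receives the AccessDenied responses, but we cannot pinpoint what this is or how to fix it.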

We tried adding a bucket policy to the S3 bucket itself, but this did not work (the snippet below also includes the S3 server access logs bucket):

resource "aws_s3_bucket" "dms_redshift_intermediate" {
  # Prefixed with `dms-` as that's what the AmazonDMSRedshiftS3Role policy filters on
  bucket = "dms-sandbox-redshift-intermediate-storage"
}

resource "aws_s3_bucket_logging" "log_bucket" {
  bucket        = aws_s3_bucket.dms_redshift_intermediate.id
  target_bucket = aws_s3_bucket.log_bucket.id
  target_prefix = "log/"
}

resource "aws_s3_bucket" "log_bucket" {
  bucket = "${aws_s3_bucket.dms_redshift_intermediate.id}-logs"
}

resource "aws_s3_bucket_acl" "log_bucket" {
  bucket = aws_s3_bucket.log_bucket.id
  acl    = "log-delivery-write"
}

resource "aws_s3_bucket_policy" "dms_redshift_intermediate_policy" {
  bucket = aws_s3_bucket.dms_redshift_intermediate.id
  policy = data.aws_iam_policy_document.dms_redshift_intermediate_policy_document.json
}

data "aws_iam_policy_document" "dms_redshift_intermediate_policy_document" {
  statement {
    actions = [
      "s3:*"
    ]

    principals {
      identifiers = [
        "dms.amazonaws.com",
        "redshift.amazonaws.com"
      ]
      type = "Service"
    }

    resources = [
      aws_s3_bucket.dms_redshift_intermediate.arn,
      "${aws_s3_bucket.dms_redshift_intermediate.arn}/*"
    ]
  }
}

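How can we fix the <AccessDenied> errors we see in CloudWatch and get data loading into Redshift? DMS is able to PUT items into the S3 bucket, as we see encrypted CSVs appearing there (the server access logs also confirm this), but DMS is unable to GET the files back out for Redshift. The AccessDenied response also suggests that this is an IAM role issue rather than a security group issue, but our IAM roles are configured as per the documentation, so we are confused as to what could be causing this.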

You are right, this is an IAM role issue. Make sure the role in question has the following statements added to its policy document:

{
  "Effect": "Allow",
  "Action": [
    "s3:ListBucket"
  ],
  "Resource": "arn:aws:s3:::<yourbucketnamehere>"
},
{
  "Effect": "Allow",
  "Action": [
    "s3:ListAllMyBuckets",
    "s3:GetBucketLocation"
  ],
  "Resource": "arn:aws:s3:::*"
}
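
Since the question deploys everything through Terraform, the two statements above could be attached to the role as an inline policy. The following is a minimal sketch, not a confirmed fix: the names dms_s3_list and dms-s3-list-intermediate-bucket are hypothetical, the bucket name is taken from the question, and it assumes the policy lives in the same stack as the aws_iam_role.dms_access_for_endpoint resource.

```hcl
# Hypothetical policy document mirroring the JSON statements above.
data "aws_iam_policy_document" "dms_s3_list" {
  # Allow listing the intermediate bucket (bucket name taken from the question).
  statement {
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::dms-sandbox-redshift-intermediate-storage"]
  }

  # Allow bucket discovery and region lookup.
  statement {
    effect    = "Allow"
    actions   = ["s3:ListAllMyBuckets", "s3:GetBucketLocation"]
    resources = ["arn:aws:s3:::*"]
  }
}

# Inline policy on the existing role; the resource and policy names are illustrative.
resource "aws_iam_role_policy" "dms_s3_list" {
  name   = "dms-s3-list-intermediate-bucket"
  role   = aws_iam_role.dms_access_for_endpoint.id
  policy = data.aws_iam_policy_document.dms_s3_list.json
}
```

An inline policy is used here so that the managed_policy_arns list on the role can stay as it is.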

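What we thought was an IAM issue was actually a security group issue. Redshift's COPY command was struggling to access S3. By adding an egress rule for HTTPS on port 443 to the Redshift security group, we were able to pull data through again: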

resource "aws_security_group_rule" "https_443_egress" {
  type              = "egress"
  description       = "Allow HTTPS egress from Redshift SG"
  protocol          = "tcp"
  to_port           = 443
  from_port         = 443
  security_group_id = aws_security_group.redshift.id
  cidr_blocks       = ["0.0.0.0/0"]
}

So if you run into the same problem as in the question, check whether Redshift can reach S3 over HTTPS.
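
If opening port 443 to 0.0.0.0/0 is broader than you want, one possible variation is to route S3 traffic through a gateway VPC endpoint and allow egress only to that endpoint's prefix list. This is an untested sketch under several assumptions: the variables vpc_id and redshift_route_table_ids are hypothetical, the region eu-west-2 is inferred from the S3 hostname in the access log, and the VPC is assumed not to have an S3 gateway endpoint already.

```hcl
# Hypothetical inputs; replace with references to your own VPC resources.
variable "vpc_id" {
  type        = string
  description = "ID of the VPC that hosts the Redshift cluster"
}

variable "redshift_route_table_ids" {
  type        = list(string)
  description = "Route tables used by the Redshift subnets"
}

# Gateway endpoint so that S3 traffic stays on the AWS network.
# Region eu-west-2 is an assumption based on the access log in the question.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.eu-west-2.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.redshift_route_table_ids
}

# Egress limited to the S3 prefix list instead of 0.0.0.0/0.
resource "aws_security_group_rule" "https_443_egress_s3_only" {
  type              = "egress"
  description       = "Allow HTTPS egress from Redshift SG to S3 only"
  protocol          = "tcp"
  from_port         = 443
  to_port           = 443
  security_group_id = aws_security_group.redshift.id
  prefix_list_ids   = [aws_vpc_endpoint.s3.prefix_list_id]
}
```

This rule would replace the 0.0.0.0/0 rule rather than sit alongside it; whether it is sufficient depends on what else the cluster needs to reach over HTTPS.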
