AWS Glue Crawler Access Denied with AmazonS3FullAccess attached

I've just set up an AWS Glue crawler to crawl an S3 bucket. I've set up an IAM role for the crawler and attached the managed policies "AWSGlueServiceRole" and "AmazonS3FullAccess" to it. I've confirmed that the crawler is using the role. However, every time I run the crawler I get an error message similar to this in the logs:

ERROR : Error Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: <omitted>; S3 Extended Request ID: <omitted>) retrieving file at s3://my-bucket/snapshots/snapshot-1/mydb/mydb.mytable/11/part-00000-ffffffff-ffff-ffff-ffff-ffffffffffff-c000.gz.parquet. Tables created did not infer schemas from this file.

I've confirmed that a Lambda with "AmazonS3ReadOnlyAccess" attached to its execution role is able to access the bucket. What am I doing wrong?

EDIT: Enabling or disabling "Block all public access" on the bucket has no appreciable effect.

EDIT2: The managed policy documents attached to the IAM role are as follows. There are no inline policies.

AWSGlueServiceRole:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:*",
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListAllMyBuckets",
                "s3:GetBucketAcl",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeRouteTables",
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcAttribute",
                "iam:ListRolePolicies",
                "iam:GetRole",
                "iam:GetRolePolicy",
                "cloudwatch:PutMetricData"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:CreateBucket"
            ],
            "Resource": [
                "arn:aws:s3:::aws-glue-*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::aws-glue-*/*",
                "arn:aws:s3:::*/*aws-glue-*/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::crawler-public*",
                "arn:aws:s3:::aws-glue-*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:*:*:/aws-glue/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags",
                "ec2:DeleteTags"
            ],
            "Condition": {
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "aws-glue-service-resource"
                    ]
                }
            },
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:security-group/*",
                "arn:aws:ec2:*:*:instance/*"
            ]
        }
    ]
}

AmazonS3FullAccess:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "*"
        }
    ]
}

It turns out the problem was KMS. The bucket contained an export of an Aurora RDS snapshot, and the export was apparently written encrypted with a KMS key. AmazonS3FullAccess grants only S3 actions, so the role could reach the objects but could not decrypt them. Once I added the following policy, the crawler worked:

{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": [
      "kms:Decrypt"
    ],
    "Resource": [
      "arn:aws:kms:<region>:<my account id>:key/<my key id>"
    ]
  }
}
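
If you are not sure whether KMS is involved, one way to check is to look at the object's metadata. The following is only a sketch, assuming boto3 is installed and that you run it with credentials allowed to read the object's metadata; the function names `kms_key_for_object` and `check_object` are made up for illustration. The S3 `head_object` call reports the server-side encryption mode in `ServerSideEncryption` and, for KMS-encrypted objects, the key in `SSEKMSKeyId`.

```python
def kms_key_for_object(head_response):
    """Return the KMS key ID from a head_object response, or None.

    None means the response does not report KMS-based encryption
    (for example, the object uses SSE-S3 / AES256 or is unencrypted).
    """
    sse = head_response.get("ServerSideEncryption", "")
    # "aws:kms" is SSE-KMS; "aws:kms:dsse" is the dual-layer variant.
    if sse.startswith("aws:kms"):
        return head_response.get("SSEKMSKeyId")
    return None


def check_object(bucket, key):
    """Hypothetical helper: fetch the metadata of one object and inspect it."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3")
    return kms_key_for_object(s3.head_object(Bucket=bucket, Key=key))
```

Calling `check_object("my-bucket", "<key of the file from the error message>")` should return a key ARN if the file is KMS-encrypted; that ARN is what goes into the `Resource` of the `kms:Decrypt` statement.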

Here is the entire managed policy attached to the role (note that the role also has AWSGlueServiceRole attached):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket/snapshots*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:<region>:<my account id>:key/<my key id>"
            ]
        }
    ]
}
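
If you would rather generate this policy than write it by hand, a minimal sketch in Python might look like the following. It assumes boto3 for the optional upload step; `build_crawler_policy` and `create_managed_policy` are illustrative names, not part of any AWS library, and the bucket, prefix, and key ARN are whatever applies to your account.

```python
import json


def build_crawler_policy(bucket, prefix, kms_key_arn):
    """Build a policy document with the same shape as the one above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read (and write) objects under the crawled prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
            {
                # Allow decrypting objects encrypted with this KMS key.
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": [kms_key_arn],
            },
        ],
    }


def create_managed_policy(name, document):
    """Hypothetical helper: create a customer managed policy, return its ARN."""
    import boto3  # imported lazily; only needed for the upload step

    iam = boto3.client("iam")
    response = iam.create_policy(
        PolicyName=name,
        PolicyDocument=json.dumps(document),
    )
    return response["Policy"]["Arn"]
```

The returned policy still has to be attached to the crawler's role, and if the key lives in another account, its key policy must also allow the role to use it.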
