
boto3 and AWS Athena permission

I am trying to use boto3, v. 1.7.4, to interact with AWS Athena through the following script:

import boto3
import botocore

# Test access to the input bucket
bucket = boto3.resource('s3').Bucket('s3_input')
print(list(bucket.objects.all()))

client = boto3.client('athena', region_name='us-east-1')

# Create a new database
db_query = 'CREATE DATABASE IF NOT EXISTS france;'
response = client.start_query_execution(
    QueryString=db_query,
    ResultConfiguration={'OutputLocation': 's3_output'})

# Create a new table
table_query = '''
CREATE EXTERNAL TABLE IF NOT EXISTS france.by_script (`content` string ) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('separatorChar' = ',')
LOCATION 's3_input';'''

response = client.start_query_execution(
    QueryString=table_query,
    ResultConfiguration={'OutputLocation': 's3_output'},
    QueryExecutionContext={'Database': 'france'})

With the current permissions of my account, the test that reads the content of s3_input works well. I can also create the database through db_query, but the table creation fails with the following error message:

Your query has the following errors:FAILED: Execution Error, return
code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:Got exception: java.io.IOException
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS
Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
AccessDenied; Request ID: [...]), S3 Extended Request ID: [...])
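
Note that start_query_execution only submits the query and returns a QueryExecutionId; the failure reason has to be fetched separately. The question does not show how the message above was retrieved, so the following is only a minimal sketch of one way to do it, polling get_query_execution until the query leaves the QUEUED/RUNNING states. The helper name wait_for_query and its poll_seconds and max_polls parameters are assumptions made for illustration.

```python
import time


def wait_for_query(client, query_execution_id, poll_seconds=1.0, max_polls=60):
    """Poll Athena until the query is no longer QUEUED or RUNNING.

    Returns a (state, reason) tuple. reason is None unless Athena
    reports a StateChangeReason (typically on FAILED queries).
    """
    for _ in range(max_polls):
        execution = client.get_query_execution(
            QueryExecutionId=query_execution_id)
        status = execution['QueryExecution']['Status']
        if status['State'] not in ('QUEUED', 'RUNNING'):
            return status['State'], status.get('StateChangeReason')
        time.sleep(poll_seconds)
    raise TimeoutError('query %s did not finish in time' % query_execution_id)


# Hypothetical usage with the client and response from the script above:
#   state, reason = wait_for_query(client, response['QueryExecutionId'])
#   if state != 'SUCCEEDED':
#       print(state, reason)
```

The client is passed in as an argument so the helper can be exercised with a stub object, without AWS credentials.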

If I run the table_query command from the console, console.aws.amazon.com/athena/home, using the same account, there is no problem and the table is properly created.

The permissions are:

{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Sid": "VisualEditor0",
           "Effect": "Allow",
           "Action": "s3:GetObject",
           "Resource": "s3_input"
       },
       {
           "Sid": "VisualEditor1",
           "Effect": "Allow",
           "Action": [
               "s3:ListAllMyBuckets",
               "s3:HeadBucket"
           ],
           "Resource": "*"
       }
   ]
}

I would be happy to understand what I am missing here. Thanks in advance.

It turns out that the following permissions make it work:

{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Effect": "Allow",
           "Action": [
               "s3:Get*",
               "s3:List*"
           ],
           "Resource": "*"
       }
   ]
}

Here is how to create a policy for a user who needs to run Athena queries from boto3.

-- S3 files bucket: sqladmin-cloudtrail
-- S3 output bucket: aws-athena-query-results-XXXXXXXXXX-us-east-1

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": [
                "arn:aws:s3:::aws-athena-query-results-XXXXXXXXXX-us-east-1",
                "arn:aws:s3:::sqladmin-cloudtrail"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::aws-athena-query-results-XXXXXXXXXXXXXXXX-us-east-1/*"
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "s3:GetObjectAcl",
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:GetBucketPolicy"
            ],
            "Resource": [
                "arn:aws:s3:::sqladmin-cloudtrail",
                "arn:aws:s3:::sqladmin-cloudtrail/*"
            ]
        },
        {
            "Sid": "VisualEditor3",
            "Effect": "Allow",
            "Action": [
                "athena:StartQueryExecution",
                "athena:CreateNamedQuery",
                "athena:RunQuery"
            ],
            "Resource": "*"
        }
    ]
}
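
If the bucket names differ from the ones above, the same policy can be generated rather than edited by hand. This is just a sketch that mirrors the four statements of this answer; the function name athena_s3_policy and its two parameters are made up for illustration, and the Sid fields are left out since they are optional.

```python
import json


def athena_s3_policy(source_bucket, results_bucket):
    """Return an IAM policy document (as a dict) mirroring the policy above."""
    source_arn = 'arn:aws:s3:::' + source_bucket
    results_arn = 'arn:aws:s3:::' + results_bucket
    return {
        'Version': '2012-10-17',
        'Statement': [
            # List both the data bucket and the query results bucket.
            {'Effect': 'Allow',
             'Action': 's3:ListBucket',
             'Resource': [results_arn, source_arn]},
            # Athena writes query results into the output bucket.
            {'Effect': 'Allow',
             'Action': 's3:PutObject',
             'Resource': results_arn + '/*'},
            # Read access to the source data.
            {'Effect': 'Allow',
             'Action': ['s3:GetObjectAcl', 's3:GetObject',
                        's3:GetObjectTagging', 's3:GetBucketPolicy'],
             'Resource': [source_arn, source_arn + '/*']},
            # Athena actions, copied as listed in the policy above.
            {'Effect': 'Allow',
             'Action': ['athena:StartQueryExecution',
                        'athena:CreateNamedQuery',
                        'athena:RunQuery'],
             'Resource': '*'},
        ],
    }


policy = athena_s3_policy('sqladmin-cloudtrail',
                          'aws-athena-query-results-XXXXXXXXXX-us-east-1')
print(json.dumps(policy, indent=4))
```

The resulting JSON string can be pasted into the IAM console, or passed as PolicyDocument to an IAM call such as put_user_policy.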

Here is a blog post I wrote about automating this: https://www.sqlgossip.com/automate-aws-athena-create-partition-on-daily-basis/

I ran into the same problem as above, but in addition to the permissions mentioned by Flavien in the answer above, my process (a Lambda function) also needed s3:PutObject and s3:AbortMultipartUpload.

Athena apparently creates objects named like folderName_$Folder$ in the SOURCE data folders, so it needs PutObject permission on them (not just read-only access). Don't ask me why AbortMultipartUpload is needed... but it comes straight from the Athena docs at https://docs.aws.amazon.com/athena/latest/ug/access.html

The entire statement for your IAM policy looks like this:

        {
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads",
                "s3:ListMultipartUploadParts",
                "s3:AbortMultipartUpload",
                "s3:CreateBucket",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::your-source-data-bucket-name*"
            ]
        }
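
To see quickly which of these actions an existing policy does not name, a rough offline check can help. The sketch below compares the Allow statements of a policy against the action list above, treating * in an action as a wildcard. It is a simplification and not how IAM evaluates access: it ignores Resource, Condition and Deny entirely. The helper names and the QUESTION_POLICY constant (the policy from the question, with Sid and Resource omitted) are assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Actions taken from the statement above.
REQUIRED_ACTIONS = [
    's3:GetBucketLocation',
    's3:GetObject',
    's3:ListBucket',
    's3:ListBucketMultipartUploads',
    's3:ListMultipartUploadParts',
    's3:AbortMultipartUpload',
    's3:CreateBucket',
    's3:PutObject',
]

# The policy from the question, reduced to the fields this check looks at.
QUESTION_POLICY = {
    'Version': '2012-10-17',
    'Statement': [
        {'Effect': 'Allow', 'Action': 's3:GetObject'},
        {'Effect': 'Allow',
         'Action': ['s3:ListAllMyBuckets', 's3:HeadBucket']},
    ],
}


def allowed_patterns(policy):
    """Collect the lower-cased action patterns of all Allow statements."""
    patterns = []
    for statement in policy.get('Statement', []):
        if statement.get('Effect') != 'Allow':
            continue
        actions = statement.get('Action', [])
        if isinstance(actions, str):
            actions = [actions]
        patterns.extend(action.lower() for action in actions)
    return patterns


def missing_actions(policy, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow pattern matches."""
    patterns = allowed_patterns(policy)
    return [action for action in required
            if not any(fnmatchcase(action.lower(), p) for p in patterns)]


print(missing_actions(QUESTION_POLICY))
```

Which actions are really needed depends on the setup; a policy with only s3:Get* and s3:List* fails this check on the write actions, yet an earlier answer reports that it was enough in that case.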
