
boto3 S3 file upload using an IAM role for authentication

I wrote a script that uploads files to S3 using boto3. It runs in Docker as a cron job. Initially I set the AWS credentials in the Dockerfile using ENV, and later switched to bind-mounting /home/$USER/.aws/ on the host to /root/.aws/ in the container.

FROM python:3.7-alpine

WORKDIR /scripts

RUN pip install boto3


COPY s3-file-upload-crontab /etc/crontabs/root
RUN chmod 644 /etc/crontabs/root

COPY s3_upload.py /scripts/s3_upload
RUN chmod a+x /scripts/s3_upload

RUN mkdir /root/info/
RUN touch /root/info/max_mod_time.json
RUN touch /root/info/error.log

RUN mkdir /root/.aws/
RUN touch /root/.aws/credentials
# RUN touch /root/.aws/config

version: '3.8'
    image: ap-aws-s3-file-upload 
      context: ./
      - ../data/features:/data
      - ./info:/root/info
      - ~/.aws/credentials:/root/.aws/credentials
      # - ~/.aws/config:/root/.aws/config

At this point the code uses my credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) for authentication and works perfectly.

I'm trying to switch the authentication to IAM roles. I've created a role in AWS called Upload_Data_To_S3 with the AmazonS3FullAccess policy.

I'm reading the docs on how to set up boto3 for IAM roles. I've set up my ~/.aws/config as follows:


[profile crossaccount]

I don't have the AWS CLI installed, so I have no profile other than my user on the AWS account. My Python code contains nothing related to authentication.


import boto3
from botocore.exceptions import ClientError
import os
import glob
import json
import time

# TODO: look into getting credentials from IAM role
s3_client = boto3.client('s3')
s3_bucket_name = 'ap-rewenables-feature-data'

max_mod_time = '0'
file_list = glob.glob('/data/*.json')  # get a list of feature files
file_mod_time = None

# get mod time for all files in data directory
file_info = [{'file': file, 'mod_time': time.strftime(
    '%Y-%m-%d %H:%M:%S', time.gmtime(os.path.getmtime(file)))} for file in file_list]

# sort files by mod time (min -> max)
timestamp_sorted_file_info = sorted(file_info, key=lambda f: f['mod_time'])
# print('File Info Sorted by Time Stamp:\n',timestamp_sorted_file_info)

# check if the file exists and not empty -> set max_mod_time from it
if os.path.exists('/root/info/max_mod_time.json') and os.stat('/root/info/max_mod_time.json').st_size != 0:
    with open('/root/info/max_mod_time.json', 'r') as mtime:
        max_mod_time = json.load(mtime)['max_mod_time']

# upload the files to s3
mod_time_last_upload = "0"
for file in timestamp_sorted_file_info:
    file_mod_time = file['mod_time']  # set mod time for the current file
    # file_mod_time = '2020-09-19 13:28:53' # for debugging
    file_name = os.path.basename(file['file'])  # get file name from file path

    if file_mod_time > max_mod_time:  # compare current file mod_time to max_mod_time from previous run
        with open(os.path.join('/data/', file_name), "rb") as f:
            s3_client.upload_fileobj(f, s3_bucket_name, file_name)

            # error check - https://stackoverflow.com/a/38376288/7582937
            # check if the file upload was successful
            try:
                s3_client.head_object(Bucket=s3_bucket_name, Key=file_name)
                mod_time_last_upload = file_mod_time
                print(file_name, ' is UPLOADED')
            except ClientError as error:
                # Not found
                if error.response['ResponseMetadata']['HTTPStatusCode'] == 404:
                    # save error to log file
                    with open('/root/info/error.log', 'w') as log:
                        log.write(str(error))
                    print("error: ", error)

        print('File Mod Time: ', file_mod_time)
        print('Mod Time Last Upload: ', mod_time_last_upload)

# save max mod time to file
# https://stackoverflow.com/a/5320889/7582937
# create JSON object to write to the file
object_to_write = json.dumps(
    {"max_mod_time": mod_time_last_upload})

# write max_mod_time to the file to be passed to the next run
# compare by value; `is not` checks object identity, which is unreliable for strings
if mod_time_last_upload != "0":
    if object_to_write:
        with open('/root/info/max_mod_time.json', 'w') as mtime:
            mtime.write(object_to_write)

When I build and run the container I get the following error:

Traceback (most recent call last):
  File "/scripts/s3_upload", line 40, in <module>
    s3_client.upload_fileobj(f, s3_bucket_name, file_name)
  File "/usr/local/lib/python3.7/site-packages/boto3/s3/inject.py", line 539, in upload_fileobj
    return future.result()
  File "/usr/local/lib/python3.7/site-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/usr/local/lib/python3.7/site-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/usr/local/lib/python3.7/site-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "/usr/local/lib/python3.7/site-packages/s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "/usr/local/lib/python3.7/site-packages/s3transfer/upload.py", line 692, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 337, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 643, in _make_api_call
    operation_model, request_dict, request_context)
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 662, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/usr/local/lib/python3.7/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/usr/local/lib/python3.7/site-packages/botocore/endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/usr/local/lib/python3.7/site-packages/botocore/endpoint.py", line 116, in create_request
  File "/usr/local/lib/python3.7/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "/usr/local/lib/python3.7/site-packages/botocore/signers.py", line 160, in sign
  File "/usr/local/lib/python3.7/site-packages/botocore/auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials

That's understandable, since I don't have the credentials in the container. What do I need to add to the code or the ~/.aws/config file for it to use the IAM role I've set up? Unfortunately the docs aren't very clear in this regard.

Thanks in advance.

Try this:

import boto3

session = boto3.Session(profile_name="crossaccount")
s3 = session.client("s3")
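
For that profile to resolve, the `[profile crossaccount]` section in ~/.aws/config has to name the role to assume and the profile that supplies the base credentials. Here is a rough sketch of what that section might look like, generated from Python so you can check the output. The account ID `123456789012` is a placeholder, `source_profile = default` assumes your user keys sit under `[default]` in ~/.aws/credentials, and the helper names `render_assume_role_profile` and `make_s3_client` are made up for illustration.

```python
import configparser
import io


def render_assume_role_profile(profile, role_arn, source_profile="default"):
    """Return the text of an AWS config section for an assume-role profile."""
    parser = configparser.ConfigParser()
    parser[f"profile {profile}"] = {
        # role that boto3 should assume via STS
        "role_arn": role_arn,
        # profile in ~/.aws/credentials whose keys are used to call STS
        "source_profile": source_profile,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def make_s3_client(profile):
    """Create an S3 client from the named profile (boto3 is imported lazily)."""
    import boto3

    return boto3.Session(profile_name=profile).client("s3")


if __name__ == "__main__":
    # Placeholder account ID; substitute the ARN of your own role.
    print(render_assume_role_profile(
        "crossaccount",
        "arn:aws:iam::123456789012:role/Upload_Data_To_S3",
    ))
```

Note that the role does not replace your user's keys: boto3 still needs the source profile's credentials to call STS, so inside the container both /root/.aws/credentials and /root/.aws/config have to be present (the config mount is commented out in your compose file). The role's trust policy also has to allow your user to assume it.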
