
Uploading files to S3 using Python

I have a list of file URLs which are download links. I have written Python code to download the files to my computer. The problem is that there are about 500 files in the list, and Chrome becomes unresponsive after downloading about 50 of them. My initial goal was to upload all the downloaded files to a bucket in S3. Is there a way to make the files go to S3 directly? Here is what I have written so far:

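import sys
from itertools import chain
import webbrowser

import requests

url = "<my_url>"
username = "<my_username>"
password = "<my_password>"
headers = {"Content-Type": "application/xml", "Accept": "*/*"}

# Fetch the JSON document that lists the download links.
response = requests.get(url, auth=(username, password), headers=headers)
if response.status_code != 200:
    # Print the raw body: an error response is not guaranteed to be JSON.
    print('Status:', response.status_code, 'Headers:', response.headers, 'Error Response:', response.text)
    sys.exit(1)

# Flatten the lists stored under each top-level key and collect the links.
data = response.json()
values = list(chain.from_iterable(data.values()))
links = [entry['download_link'] for entry in values]

# Open each link in the default web browser, which then downloads the file.
for item in links:
    webbrowser.open(item)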

It's quite simple using Python 3 and boto3 (the AWS SDK for Python), e.g.:

import boto3

s3 = boto3.client('s3')
with open('filename.txt', 'rb') as data:
    s3.upload_fileobj(data, 'bucketname', 'filenameintos3.txt')

For more information, you can read the boto3 documentation here: http://boto3.readthedocs.io/en/latest/guide/s3-example-creating-buckets.html
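
To send the files from the question straight to S3 without going through the browser or the local disk, the same upload_fileobj call can probably be fed a streamed HTTP response instead of an open file. The following is only a sketch under a few assumptions: the helper names key_from_url and stream_links_to_s3 are made up for illustration, the object key is taken from the last path segment of each link (which may not suit every download URL), and s3_client and session are expected to be a boto3 S3 client and a requests.Session.

```python
import posixpath
from urllib.parse import unquote, urlparse


def key_from_url(url, prefix=""):
    """Hypothetical helper: use the last path segment of the URL as the S3 key."""
    name = posixpath.basename(unquote(urlparse(url).path))
    if not name:
        raise ValueError(f"cannot derive a file name from {url!r}")
    return f"{prefix}{name}"


def stream_links_to_s3(links, bucket, s3_client, session, prefix=""):
    """Copy the body of every URL into the bucket without saving it locally.

    s3_client is assumed to behave like boto3.client('s3') and session like
    requests.Session(); both are passed in so the caller controls credentials.
    """
    uploaded = []
    for url in links:
        key = key_from_url(url, prefix)
        # stream=True keeps requests from loading the whole body into memory.
        with session.get(url, stream=True) as response:
            response.raise_for_status()
            # Ask urllib3 to undo gzip/deflate transfer encoding, if any.
            response.raw.decode_content = True
            # response.raw is file-like, so upload_fileobj can read it in chunks.
            s3_client.upload_fileobj(response.raw, bucket, key)
        uploaded.append(key)
    return uploaded
```

With the variables from the question, a call might look like stream_links_to_s3(links, 'bucketname', boto3.client('s3'), session), where session is a requests.Session() whose auth attribute has been set to (username, password). The files are processed one at a time, so 500 links should not exhaust memory, although no retry logic is included.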

Enjoy.

If you have the AWS CLI installed on your system, you can make use of the subprocess library. For example:

import subprocess


def copy_file_to_s3(source: str, target: str, bucket: str):
    # Shell out to the AWS CLI: aws s3 cp <source> s3://<bucket>/<target>
    subprocess.run(["aws", "s3", "cp", source, f"s3://{bucket}/{target}"])

Similarly, you can use the same approach for all sorts of AWS client operations, such as downloading or listing files. This way there is no need to import Boto3. I guess the CLI is not intended to be used that way, but in practice I find it quite convenient. You also get the status of the upload displayed in your console, for example:

Completed 3.5 GiB/3.5 GiB (242.8 MiB/s) with 1 file(s) remaining

To adapt the function to your needs, I recommend having a look at the subprocess reference as well as the AWS CLI reference.
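
As one possible adaptation of the subprocess approach, the sketch below passes check=True so that a failed aws call raises subprocess.CalledProcessError instead of passing silently, and adds a listing helper built on aws s3 ls. The function name list_s3_objects and the run parameter are assumptions made for illustration; run only exists so the function that executes the command can be swapped out.

```python
import subprocess
from typing import List


def copy_file_to_s3(source: str, target: str, bucket: str, run=subprocess.run) -> None:
    """Upload one local file with `aws s3 cp`; raise if the CLI reports failure."""
    command = ["aws", "s3", "cp", source, f"s3://{bucket}/{target}"]
    # Output is not captured, so the CLI's progress line still reaches the console.
    run(command, check=True)


def list_s3_objects(bucket: str, prefix: str = "", run=subprocess.run) -> List[str]:
    """Hypothetical helper: return the raw output lines of `aws s3 ls`."""
    command = ["aws", "s3", "ls", f"s3://{bucket}/{prefix}"]
    # capture_output/text make the listing available as a string on stdout.
    completed = run(command, check=True, capture_output=True, text=True)
    return completed.stdout.splitlines()
```

For example, copy_file_to_s3('filename.txt', 'filenameintos3.txt', 'bucketname') would run the same command as the function above, but a missing bucket or bad credentials would surface as an exception rather than only as text in the console.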
