Downloading a file from an S3 bucket to the user's computer

Goal

Download a file from an S3 bucket to the user's computer.

Context

I am working on a Python/Flask API for a React app. When the user clicks the Download button on the front end, I want to download the appropriate file to their machine.

What I've tried

import boto3
s3 = boto3.resource('s3')
s3.Bucket('mybucket').download_file('hello.txt', '/tmp/hello.txt')

I am currently using some code that finds the path of the downloads folder and then passes that path to download_file() as the second parameter, along with the key of the file in the bucket that the user is trying to download.

This worked locally, and tests ran fine, but I ran into a problem once it was deployed. The code finds the downloads path of the SERVER and downloads the file there.

Question

What is the best way to approach this? I have researched this and cannot find a good solution for downloading a file from the S3 bucket to the user's downloads folder. Any help or advice is greatly appreciated.

You should not need to save the file to the server. You can just download the file into memory and then build a Response object containing the file.

from flask import Flask, Response
from boto3 import client

app = Flask(__name__)


def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )


@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    file = s3.get_object(Bucket='blah-test1', Key='blah.txt')
    return Response(
        file['Body'].read(),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )


app.run(debug=True, port=8800)

This is fine for small files, since the user won't notice any meaningful wait. With larger files, however, it will hurt the user experience: the file has to be downloaded completely to the server before any of it is sent on to the user. To fix this, use the Range keyword argument of the get_object method:

from flask import Flask, Response
from boto3 import client

app = Flask(__name__)


def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )


def get_total_bytes(s3):
    # head_object returns the object's metadata without downloading the body.
    return s3.head_object(Bucket='blah-test1', Key='blah.txt')['ContentLength']


def get_object(s3, total_bytes):
    # Small files are read in one request; larger ones are streamed in chunks.
    if total_bytes > 1000000:
        return get_object_range(s3, total_bytes)
    return s3.get_object(Bucket='blah-test1', Key='blah.txt')['Body'].read()


def get_object_range(s3, total_bytes, chunk_size=1000000):
    # Yield the object one chunk at a time using HTTP Range requests.
    for offset in range(0, total_bytes, chunk_size):
        # Range bounds are inclusive, so the last valid byte index is total_bytes - 1.
        end = min(offset + chunk_size, total_bytes) - 1
        byte_range = 'bytes={offset}-{end}'.format(offset=offset, end=end)
        yield s3.get_object(Bucket='blah-test1', Key='blah.txt', Range=byte_range)['Body'].read()


@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    total_bytes = get_total_bytes(s3)

    return Response(
        get_object(s3, total_bytes),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )


app.run(debug=True, port=8800)

This downloads the file in 1 MB chunks and sends each chunk to the user as soon as it is downloaded. Both versions have been tested with a 40 MB .txt file.
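The chunk boundaries used above can be checked without touching S3. The following is a minimal sketch, not part of the original answer: `byte_ranges` is a hypothetical helper that produces the `Range` values for a given object size, assuming inclusive byte offsets as in HTTP range requests.

```python
def byte_ranges(total_bytes, chunk_size=1000000):
    """Yield HTTP Range header values that cover total_bytes in order.

    Hypothetical helper for illustration; both ends of each range are inclusive.
    """
    for offset in range(0, total_bytes, chunk_size):
        end = min(offset + chunk_size, total_bytes) - 1
        yield 'bytes={}-{}'.format(offset, end)


# A 2.5 MB object is split into two full chunks and one partial chunk.
for header in byte_ranges(2500000):
    print(header)
```

Each value can be passed as the Range argument of get_object, so the arithmetic can be tested separately from the network calls.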

A better way to solve this problem is to create a presigned URL. This gives you a temporary URL that is valid for a limited amount of time. It also removes your Flask server as a proxy between the user and the S3 bucket, which reduces download time for the user.

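import boto3


def get_attachment_url():
    bucket = 'BUCKET_NAME'
    key = 'FILE_KEY'

    # YOUR_AWS_ACCESS_KEY and YOUR_AWS_SECRET_KEY are placeholders for your own credentials.
    client = boto3.client(
        's3',
        aws_access_key_id=YOUR_AWS_ACCESS_KEY,
        aws_secret_access_key=YOUR_AWS_SECRET_KEY
    )

    # The returned URL stops working after ExpiresIn seconds.
    return client.generate_presigned_url(
        'get_object',
        Params={'Bucket': bucket, 'Key': key},
        ExpiresIn=60
    )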
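With a presigned URL the browser fetches the file straight from S3, so the Flask app can no longer set the Content-Disposition header itself. One way to keep the "save as a download" behaviour is to ask S3 to send that header through the ResponseContentDisposition parameter of get_object. This is a minimal sketch under stated assumptions: `build_download_url` is a hypothetical helper, and `FakeS3Client` is a stand-in used only so the example runs without AWS credentials. In a real app the client would come from boto3.client('s3').

```python
def build_download_url(s3_client, bucket, key, filename, expires_in=60):
    """Return a presigned GET URL that asks S3 to serve the object as an attachment.

    Hypothetical helper; s3_client is anything exposing generate_presigned_url,
    such as the object returned by boto3.client('s3').
    """
    return s3_client.generate_presigned_url(
        'get_object',
        Params={
            'Bucket': bucket,
            'Key': key,
            # S3 returns this value as the Content-Disposition response header.
            'ResponseContentDisposition': 'attachment; filename="{}"'.format(filename),
        },
        ExpiresIn=expires_in,
    )


class FakeS3Client:
    """Stand-in for a boto3 client so the sketch runs offline; it signs nothing."""

    def generate_presigned_url(self, ClientMethod, Params=None, ExpiresIn=3600):
        # Record the arguments so they can be inspected, then return a dummy URL.
        self.last_call = (ClientMethod, Params, ExpiresIn)
        return 'https://example.invalid/{}/{}'.format(Params['Bucket'], Params['Key'])


fake = FakeS3Client()
url = build_download_url(fake, 'BUCKET_NAME', 'FILE_KEY', 'test.txt')
```

The Flask route could then return this URL to the React front end, or redirect to it, and the browser would save the file on the user's machine rather than on the server.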
