
How to get a downloadable URL for an S3 bucket itself, not an object URL, using Python and boto3?

I have lots of files and sub-folders (which themselves contain more folders and files) inside an S3 bucket. I know how to download individual files: each object has an object URL, and clicking on it downloads that file.

Requirement

But my requirement is that I need a downloadable URL for the S3 bucket itself, so that clicking on it downloads all the contents of the bucket (files, sub-folders, etc.) intact, exactly as they are.

import subprocess

# Local folder to back up and the target S3 bucket
path = "C:\\Users\\lenovo\\Desktop\\BackUp"

# Shell out to the AWS CLI to sync the folder into the bucket
subprocess.run(['aws', 's3', 'sync', path, 's3://axis-tax-drive'])

I wrote this code to upload the content to S3; now I'd like to get a downloadable URL for the S3 bucket, as mentioned above.

Could the requirement be satisfied by creating an access point or something like that?

I'd like to know all the possibilities.

Please Help..

Thanks in advance.

It is not possible to "download a bucket from a URL". The API calls for Amazon S3 can only download a single object. Nor is it possible to ask S3 to provide a Zip of multiple files stored in S3.

However, you could use the AWS Command-Line Interface (CLI) to do it...

Your code shows an example of using the AWS CLI:

aws s3 sync <path> s3://bucketname

The AWS CLI is a Python program that calls the S3 API. For the above command, it lists the contents of path and then uses a loop to call the PutObject() command to upload one file at a time. However, it's a little bit smart because it uses multi-threading to upload multiple files at the same time (but each upload is done by a separate API call).
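Conceptually, a much-simplified sketch of that upload behaviour with boto3 could look like the following (using the bucket name and folder from the question; the real CLI additionally handles retries, multipart uploads, and skipping files that have not changed):

import os
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
BUCKET = "axis-tax-drive"                          # bucket name from the question
LOCAL_PATH = "C:\\Users\\lenovo\\Desktop\\BackUp"  # folder from the question

def upload_one(local_file, key):
    # One upload call per file, roughly what the CLI's PutObject loop does
    s3.upload_file(local_file, BUCKET, key)
    print("Uploaded", key)

# Multi-threading: upload several files at the same time
futures = []
with ThreadPoolExecutor(max_workers=10) as pool:
    for root, _dirs, files in os.walk(LOCAL_PATH):
        for name in files:
            local_file = os.path.join(root, name)
            # Build the S3 key from the path relative to the sync root
            key = os.path.relpath(local_file, LOCAL_PATH).replace(os.sep, "/")
            futures.append(pool.submit(upload_one, local_file, key))
for future in futures:
    future.result()    # surface any upload errors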

You could use the same command in reverse to download a bucket to your computer:

aws s3 sync s3://bucketname <path>

Or, you could write your own program that loops through the files and downloads them individually.
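For example, a minimal sketch of such a download program, assuming the bucket from the question and a hypothetical destination folder, could be:

import os

import boto3

s3 = boto3.client("s3")
BUCKET = "axis-tax-drive"                     # bucket name from the question
DEST = "C:\\Users\\lenovo\\Desktop\\Restore"  # hypothetical local target folder

# Paginate so buckets with more than 1000 objects are fully listed
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):      # skip zero-byte "folder" placeholder objects
            continue
        # Recreate the folder structure locally, then download the object
        target = os.path.join(DEST, *key.split("/"))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        s3.download_file(BUCKET, key, target)
        print("Downloaded", key)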

Another option would be the API Gateway / Lambda approach.

Suppose you have a Lambda which downloads all objects from your S3 bucket and puts them into a zip, like this:

import base64
import logging
import os
from io import BytesIO
from typing import Any, Dict
from zipfile import ZipFile

import boto3

LOGGER = logging.getLogger("zip-bucket")
logging.basicConfig(level="INFO")


def handle(event: Dict[str, Any], context: Dict[str, Any]):
    """Lambda entry point. If you upload this to AWS Lambda, this is the
    handler that gets called; the event is not needed here."""
    s3 = boto3.client("s3")
    bucket = os.environ["BUCKET"]
    zip_buffer = BytesIO()
    with ZipFile(zip_buffer, 'w') as myzip:
        # Paginate so buckets with more than 1000 objects are fully included
        for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
            for s3_object in page.get('Contents', []):
                key = s3_object['Key']
                LOGGER.info("Zipping %s", key)
                myzip.writestr(key, s3.get_object(Bucket=bucket, Key=key)['Body'].read())
    # API Gateway expects binary payloads to be base64-encoded; keep in mind
    # the synchronous Lambda response payload is limited to 6 MB
    return {
        'headers': {"Content-Type": "application/zip"},
        'statusCode': 200,
        'body': base64.b64encode(zip_buffer.getvalue()).decode('ascii'),
        'isBase64Encoded': True
    }


if __name__ == '__main__':
    # Local test: decode the base64 body back into a zip file on disk
    with open("result.zip", "wb") as zipfile:
        zipfile.write(base64.b64decode(handle(None, None)['body']))

You can run it locally for testing, or you can create an AWS Lambda from it. Since the only dependency is the AWS SDK (boto3), which the Lambda runtime already provides, it runs out of the box. With that, your request handling logic is in place; what is still missing is some kind of HTTP endpoint, and this is where API Gateway comes into play. But first, keep in mind that you have to give the Lambda's execution role permissions on your S3 bucket (ListBucket and GetObject).
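For illustration, granting those two permissions as an inline policy could look roughly like this with boto3 (the role name is hypothetical, and the bucket is the one the Lambda reads via its BUCKET environment variable):

import json

import boto3

iam = boto3.client("iam")
BUCKET = "axis-tax-drive"                # the bucket the Lambda reads from
ROLE_NAME = "zip-bucket-lambda-role"     # hypothetical execution role name

# Inline policy granting only the two S3 actions the Lambda needs
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:ListBucket",
         "Resource": f"arn:aws:s3:::{BUCKET}"},
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": f"arn:aws:s3:::{BUCKET}/*"},
    ],
}

iam.put_role_policy(
    RoleName=ROLE_NAME,
    PolicyName="allow-read-zip-bucket",
    PolicyDocument=json.dumps(policy),
)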

In AWS, go to API Gateway and create an HTTP API, because this is the simplest one. You can connect the previously created Lambda while creating the new API. If you do it via the AWS Console, it will create the invoke permissions on your Lambda on the fly. After that you get an HTTP endpoint where you can download the zip. If it does not work, you need to look into the CloudWatch logs of either the Lambda or your API Gateway.
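If you would rather script this step than click through the console, a rough boto3 sketch might look like the following (the Lambda ARN is a placeholder; unlike the console, this does not add the invoke permission for you, hence the add_permission call):

import boto3

apigw = boto3.client("apigatewayv2")
lam = boto3.client("lambda")

# Placeholder ARN of the zipping Lambda created above
LAMBDA_ARN = "arn:aws:lambda:eu-central-1:123456789012:function:zip-bucket"

# "Quick create" an HTTP API that proxies every request to the Lambda
api = apigw.create_api(
    Name="zip-bucket-api",
    ProtocolType="HTTP",
    Target=LAMBDA_ARN,
)

# Allow API Gateway to invoke the function (the console would do this for you);
# in a real setup you would also restrict this with a SourceArn
lam.add_permission(
    FunctionName=LAMBDA_ARN,
    StatementId="allow-http-api-invoke",
    Action="lambda:InvokeFunction",
    Principal="apigateway.amazonaws.com",
)

print("Download endpoint:", api["ApiEndpoint"])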

This link might be interesting for you: https://docs.aws.amazon.com/apigateway/latest/developerguide/lambda-proxy-binary-media.html

Keep in mind that this approach has no authentication built in!
