
Use Python to download S3 objects to an arbitrarily defined local directory

I have a Python function (see below) which iterates over the objects in a remote S3 directory, downloads each one, and stores it in a local folder.

In its current state, the files go here:

AnalysisOutput/file1
AnalysisOutput/file2
AnalysisOutput/file3

AnalysisOutput is the name of the remote directory (the key prefix inside the bucket). I don't want that directory name to be hard-coded on my local instance. Instead, I want the files to go here:

tempS3output/file1
tempS3output/file2
tempS3output/file3

import os
import boto3

def downloadDirectoryFroms3(bucketName, remoteDirectoryName):
    s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucketName)
    number = 0
    for object in bucket.objects.filter(Prefix=remoteDirectoryName):
        number = number + 1
        if not os.path.exists(os.path.dirname(object.key)):
            os.makedirs(os.path.dirname(object.key))
        # The local path is the object key itself, so files land under AnalysisOutput/.
        bucket.download_file(object.key, object.key)

downloadDirectoryFroms3('reciter-dynamodb', 'AnalysisOutput')

You could use str.removeprefix (a built-in string method in Python 3.9 and later) to strip the remote directory name from each key and build the local path from what is left. Your function would become:

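import os
import boto3

def download_directory_from_s3(bucket_name, remote_directory_name, local_directory_name):
    s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucket_name)
    for object in bucket.objects.filter(Prefix=remote_directory_name):
        # Drop the remote prefix (and the slash after it) to get a relative path.
        relative_path = object.key.removeprefix(remote_directory_name).lstrip('/')
        if not relative_path or relative_path.endswith('/'):
            continue  # skip "folder" placeholder keys, which have no file content
        local_path = os.path.join(local_directory_name, relative_path)
        # Create the local directory, not the remote one, before downloading.
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        bucket.download_file(object.key, local_path)

download_directory_from_s3('reciter-dynamodb', 'AnalysisOutput', 'tempS3output')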

This assumes object.key is a string, which it is for the objects boto3 returns here. If it were not, you could convert it first with str(object.key).
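
To make the prefix handling concrete, here is a minimal sketch of the key-to-path mapping on its own, with no S3 access involved. The helper name local_path_for_key is made up for illustration, and the sketch assumes Python 3.9 or later for str.removeprefix.

```python
import os


def local_path_for_key(key: str, remote_prefix: str, local_dir: str) -> str:
    """Map an S3 key to a path under local_dir by dropping remote_prefix."""
    # str.removeprefix returns the string unchanged when the prefix is absent.
    # What remains still starts with "/", so strip it before joining.
    relative = key.removeprefix(remote_prefix).lstrip("/")
    return os.path.join(local_dir, relative)


print(local_path_for_key("AnalysisOutput/file1", "AnalysisOutput", "tempS3output"))
# prints tempS3output/file1 on Linux/macOS
```

Stripping the leading slash matters because os.path.join discards everything before a component that starts with "/", which would drop the local directory entirely.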

The comment from @Methacrylon was very helpful. I made a couple of tweaks to make this work for me.

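import os
import boto3

def download_directory_from_s3(bucket_name, remote_directory_name, local_directory_name):
    s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucket_name)
    for object in bucket.objects.filter(Prefix=remote_directory_name):
        print(object)
        # Drop the remote prefix (and the slash after it) to get a relative path.
        relative_path = object.key.removeprefix(remote_directory_name).lstrip('/')
        if not relative_path or relative_path.endswith('/'):
            continue  # skip "folder" placeholder keys, which have no file content
        file_name = os.path.join(local_directory_name, relative_path)
        # download_file does not create missing directories, so create them first.
        os.makedirs(os.path.dirname(file_name), exist_ok=True)
        bucket.download_file(object.key, file_name)

# Call the function after it has been defined.
download_directory_from_s3('reciter-dynamodb', 'AnalysisOutput', 'temp')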
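
If you want to check the path logic without touching AWS, one option is to pass the bucket in as a parameter, so a stand-in object can be substituted in a test. This is only a sketch: download_prefix is a hypothetical name, and it assumes the bucket argument offers objects.filter(Prefix=...) and download_file(key, filename) the way a boto3 Bucket resource does.

```python
import os


def download_prefix(bucket, remote_prefix, local_dir):
    """Download every object under remote_prefix into local_dir; return the local paths."""
    downloaded = []
    for obj in bucket.objects.filter(Prefix=remote_prefix):
        # Path of the object relative to the remote prefix.
        relative = obj.key.removeprefix(remote_prefix).lstrip("/")
        if not relative or relative.endswith("/"):
            continue  # "folder" placeholder keys have no file to write
        local_path = os.path.join(local_dir, relative)
        parent = os.path.dirname(local_path)
        if parent:
            # Create nested local directories before writing the file.
            os.makedirs(parent, exist_ok=True)
        bucket.download_file(obj.key, local_path)
        downloaded.append(local_path)
    return downloaded


# Usage with boto3 (needs AWS credentials, so it is left commented out):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("reciter-dynamodb")
#   download_prefix(bucket, "AnalysisOutput", "temp")
```

Returning the list of local paths replaces the print call as a way to see what was downloaded.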
