
Python 3 + boto3 + s3: download all files in a folder

I am writing a Python 3.4 + boto3 script to download all files in an s3 bucket/folder. I'm using s3.resource rather than client because this EMR cluster already has the key credentials.

This works to download a single file:

import boto3

s3 = boto3.resource('s3')
bucket = "my-bucket"
file = "some_file.zip"
filepath = "some_folder/some_file.zip"


def DL(bucket, key, local_name):
    s3.Bucket(bucket).download_file(key, local_name)

DL(bucket, filepath, file)

But I need to download all files in a folder within the bucket, which have names like so:

some_file_1.zip
some_file_2.zip
some_file_3.zip, etc.

It should be simple, but I guess we can't use a wildcard or pattern match like "some_file*". So I have to loop through and find each file name?

And call download_file for each file name? 并为每个文件名称调用download_file?
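Since the question sticks with the resource API, one way is to filter a Bucket resource by prefix and download each match. A minimal sketch (the function name and arguments are my own, not from the question):

```python
import os


def download_prefix(bucket, prefix, dest="."):
    """Download every object under `prefix` from a boto3 Bucket resource.

    bucket.objects.filter(Prefix=...) lists only keys that start with
    the prefix, so no client-side wildcard matching is needed.
    """
    saved = []
    for obj in bucket.objects.filter(Prefix=prefix):
        name = os.path.basename(obj.key)
        if not name:  # skip the zero-byte "folder" placeholder key itself
            continue
        target = os.path.join(dest, name)
        bucket.download_file(obj.key, target)
        saved.append(target)
    return saved
```

Called as `download_prefix(boto3.resource('s3').Bucket('my-bucket'), 'some_folder/')`, it picks up the same credentials the EMR cluster already provides.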

You can use `list_objects_v2` and pass a prefix to get only the keys inside your s3 "folder". Now you can use a for loop to go through all these keys and download them all. Use conditions if you need to filter them further.
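A sketch of that approach with the low-level client API (the function name and arguments are my own): `list_objects_v2` returns at most 1,000 keys per call, so a paginator is used to follow the continuation tokens.

```python
import os


def download_folder(s3_client, bucket, prefix, dest="."):
    """Download every key under `prefix` using a boto3 S3 client."""
    downloaded = []
    # list_objects_v2 caps each response at 1000 keys; the paginator
    # follows the continuation tokens for us.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            name = os.path.basename(key)
            if not name:  # skip the folder placeholder key itself
                continue
            target = os.path.join(dest, name)
            s3_client.download_file(bucket, key, target)
            downloaded.append(target)
    return downloaded
```

Called as `download_folder(boto3.client("s3"), "my-bucket", "some_folder/")`, it uses the cluster's ambient credentials just like the resource API does.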
