
Download only specific folder in S3 bucket using python boto

The link below shows how to download the entire contents of an S3 bucket. However, how does one get only the contents of a subfolder? Suppose my S3 folder has the following emulated structure.

S3Folder/S1/file1.c

S3Folder/S1/file2.h

S3Folder/S1/file1.h

S3Folder/S2/file.exe

S3Folder/S2/resource.data

Suppose I am interested only in the S2 folder. How do I isolate its keys in the bucket list?

local backup of an S3 content

import os
import boto

conn = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(bucket_name)

# go through the list of files
bucket_list = bucket.list()
for l in bucket_list:
  keyString = str(l.key)
  d = LOCAL_PATH + keyString
  try:
    l.get_contents_to_filename(d)
  except OSError:
    # the key is a "folder" entry: create the local directory if it is missing
    if not os.path.exists(d):
      os.mkdir(d)

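You can download S3 objects selectively by filtering on the key prefix.

So, for your question, you just need to use the prefix 'S2/' when listing the objects to download (or 'S3Folder/S2/' if S3Folder is part of the key rather than the bucket name). S3 keys normally do not begin with a slash, so a prefix of '/S2' would usually match nothing.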

FYI: s3 download object using boto3

For more, check this
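A minimal sketch of the prefix approach with boto (v2), the library used in the question: `bucket.list()` accepts a `prefix` argument, so only the matching keys come back. The helper name `download_prefix` and the local directory layout are my own assumptions, and the 'S3Folder/S2/' prefix assumes S3Folder is part of the key rather than the bucket name.

```python
import os


def download_prefix(bucket, prefix, local_root):
    """Download every key under `prefix` from a boto (v2) bucket into local_root.

    `bucket` is expected to be what conn.get_bucket(...) returns.
    Returns the list of local file paths that were written.
    """
    downloaded = []
    for key in bucket.list(prefix=prefix):
        name = str(key.key)
        if name.endswith('/'):
            # zero-byte "folder" placeholder, nothing to download
            continue
        target = os.path.join(local_root, *name.split('/'))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        key.get_contents_to_filename(target)
        downloaded.append(target)
    return downloaded


# Usage, with the names from the question (not executed here):
#   conn = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
#   bucket = conn.get_bucket(bucket_name)
#   download_prefix(bucket, 'S3Folder/S2/', LOCAL_PATH)
```

Ending the prefix with a slash keeps a sibling such as S3Folder/S20/ from being picked up as well.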

You could do the following:

import os
import boto3

s3_resource = boto3.resource("s3", region_name="us-east-1")


def download_objects():
    root_dir = 'D:/'  # local download location
    s3_bucket_name = 'S3_Bucket_Name'  # S3 bucket name
    s3_root_folder_prefix = 'sample'  # root folder inside the bucket
    s3_folder_list = ['s3_folder_1', 's3_folder_2', 's3_folder_3']  # sub folders inside the root folder

    my_bucket = s3_resource.Bucket(s3_bucket_name)
    for file in my_bucket.objects.filter(Prefix=s3_root_folder_prefix):
        # keep only keys that mention one of the wanted sub folders
        if any(s in file.key for s in s3_folder_list):
            try:
                path, filename = os.path.split(file.key)
                if not filename:
                    continue  # key ends with '/': a folder placeholder, nothing to download
                os.makedirs(os.path.join(root_dir, path), exist_ok=True)
                my_bucket.download_file(file.key, os.path.join(root_dir, path, filename))
            except Exception as err:
                print(err)


if __name__ == '__main__':
    download_objects()
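One caveat with the `s in file.key` test: it matches the folder name anywhere in the key, so 's3_folder_1' would also match 'sample/s3_folder_10/a.txt' or a file called 's3_folder_1.txt'. If that matters, a stricter check could look like the sketch below. The helper name `key_in_folders` is hypothetical and not part of boto3; it only compares strings.

```python
def key_in_folders(key, root_prefix, folders):
    """Return True if `key` sits under root_prefix/<folder>/ for one of `folders`."""
    root = root_prefix.strip('/')
    for folder in folders:
        # build "root/folder/" and skip an empty root so no leading slash appears
        parts = [p for p in (root, folder.strip('/')) if p]
        if key.startswith('/'.join(parts) + '/'):
            return True
    return False
```

It could replace the `any(...)` condition, for example `if key_in_folders(file.key, s3_root_folder_prefix, s3_folder_list):`.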
