
Download multiple files from specific “subdirectory”, AWS S3 with boto3 & Python 3.7

import boto3
import os 

client = boto3.client('connect')

s3 = boto3.resource(
    service_name='s3',
    region_name='us-west-2',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key
)

   
for my_bucket_object in s3.Bucket("my_bucket").objects.filter(Prefix="user/folder/"):
    s3.Object(my_bucket_object.bucket_name, my_bucket_object.key).download_file(f'./aws/{my_bucket_object.key}')

  1. Without iteration, but with similar code, I can successfully download individual files.
  2. Without downloading, printing the bucket keys shows normal output.

However, when I iterate over multiple files and use each key as input for download_file, I get the following error message. The target key's name seems to be changing?

FileNotFoundError: [Errno 2] No such file or directory: './aws/user/folder\.7g4DBa9A'

I have the following two questions:

  1. How can I prevent this from happening and download the files?
  2. Is there a way to separate file names from "subdirectories"? (I realize AWS doesn't use those, but keys contain directory/file-like names separated only by "/", and I would like to separate those for saving purposes.)

Found the answer thanks to Marcin's comment. After printing every key returned by the iteration, it turned out that the first one was the "folder" itself, which turned into a strange name when downloading, i.e.:

user/folder/
user/folder/file1
user/folder/file2
etc.

Thus, skipping that first iteration solved it.

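for obj in my_bucket.objects.filter(Prefix=prefix):
    # The last "/"-separated part of the key is the file name.
    output_file = obj.key.split('/')[-1]

    # The "folder" key ends with "/", so its file name is empty: skip it.
    if output_file == "":
        continue

    # './aws/' is the directory from the question; any existing local output path works.
    s3.Object(bucket_name=my_bucket.name, key=obj.key).download_file(f'./aws/{output_file}')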
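
If the local copy should keep the "subdirectory" layout of the keys instead of flattening everything into one directory, the loop can be wrapped roughly as follows. This is only a sketch under a few assumptions: download_prefix and dest_root are hypothetical names, bucket is expected to be a boto3 Bucket resource such as s3.Bucket("my_bucket"), and the local directories are created up front because download_file does not appear to create them itself.

```python
import os


def download_prefix(bucket, prefix, dest_root):
    """Download every object under `prefix` into `dest_root`, mirroring the key layout.

    `bucket` is assumed to be a boto3 Bucket resource,
    e.g. boto3.resource("s3").Bucket("my_bucket").
    Returns the list of local paths that were written.
    """
    downloaded = []
    for obj in bucket.objects.filter(Prefix=prefix):
        # "Folder" placeholder keys end with "/"; there is no file to save.
        if obj.key.endswith("/"):
            continue

        # Map the "/"-separated key onto a local path below dest_root.
        local_path = os.path.join(dest_root, *obj.key.split("/"))

        # Create the parent directory first so the download has somewhere to write.
        parent = os.path.dirname(local_path)
        if parent:
            os.makedirs(parent, exist_ok=True)

        bucket.download_file(obj.key, local_path)
        downloaded.append(local_path)
    return downloaded
```

With the names from the question, a call would look like download_prefix(s3.Bucket("my_bucket"), "user/folder/", "./aws").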

Is there a way to separate file names from "subdirectories"?

You can split the key by / and take the last element before you call download_file:

output_file = my_bucket_object.key.split('/')[-1]
s3.Object(my_bucket_object.bucket_name, my_bucket_object.key).download_file(f'./aws/{output_file}')
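
If both parts of the key are needed rather than only the file name, a small helper along these lines may be handy. It is a minimal sketch: split_key is a hypothetical name (not part of boto3) and it relies only on str.rpartition.

```python
def split_key(key):
    """Split an S3 key at its last "/" into (prefix, file_name).

    A "folder" placeholder key such as "user/folder/" yields an empty
    file_name, which makes it easy to detect and skip.
    """
    prefix, _, file_name = key.rpartition("/")
    return prefix, file_name


# split_key("user/folder/file1") -> ("user/folder", "file1")
# split_key("user/folder/")      -> ("user/folder", "")
# split_key("file1")             -> ("", "file1")
```

The empty file_name for "user/folder/" is the same condition the loop in the question's update checks before calling download_file.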
