Amazon S3 boto - how to delete a folder?

Question

I created a folder in S3 named "test" and pushed "test_1.jpg" and "test_2.jpg" into it.

How can I use boto to delete the folder "test"?

Answer 1

Here is the 2018 (almost 2019) version, using boto3:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
# deletes every object whose key starts with the prefix
bucket.objects.filter(Prefix="myprefix/").delete()

Answer 2

There are no folders in S3. Instead, the keys form a flat namespace. However, a key with slashes in its name is displayed specially in some programs, including the AWS console (see for example "Amazon S3 boto - how to create a folder?").

Instead of deleting "a directory", you can (and have to) list the keys by prefix and delete them. In essence:

# boto 2 API: `bucket` is an existing boto.s3.bucket.Bucket
for key in bucket.list(prefix='your/directory/'):
    key.delete()

However, the other answers on this page feature more efficient approaches.

Notice that the prefix is matched with a plain string comparison. If the prefix were your/directory, that is, without the trailing slash, the program would also happily delete your/directory-that-you-wanted-to-remove-is-definitely-not-this-one.

For more information, see "S3 boto list keys sometimes returns directory key".
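To make the pitfall concrete, here is a minimal sketch that mimics the prefix match with plain `str.startswith` on a made-up list of keys. `normalize_prefix` is a hypothetical helper for illustration, not part of boto.

```python
def normalize_prefix(prefix: str) -> str:
    """Return the prefix with exactly one trailing slash (hypothetical helper)."""
    if not prefix:
        # an empty prefix would match every key in the bucket
        raise ValueError("refusing to build a prefix from an empty string")
    return prefix.rstrip("/") + "/"


# Made-up keys, for illustration only.
keys = [
    "your/directory/test_1.jpg",
    "your/directory/test_2.jpg",
    "your/directory-keep-me/important.jpg",
]

# A prefix matches when the key starts with it, like str.startswith.
without_slash = [k for k in keys if k.startswith("your/directory")]
with_slash = [k for k in keys if k.startswith(normalize_prefix("your/directory"))]

print(len(without_slash))  # 3: the unrelated "directory-keep-me" key matches too
print(len(with_slash))     # 2: only the keys inside "your/directory/"
```

Normalizing the prefix before passing it to the listing call is one cheap way to guard against deleting a sibling "folder" by accident.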

Answer 3

It's been a while, and boto3 has a few different ways of accomplishing this goal. This assumes you want to delete the test "folder" and all of its objects. Here is one way:

import boto3

s3 = boto3.resource('s3')
# a single list_objects call returns at most 1000 keys
objects_to_delete = s3.meta.client.list_objects(Bucket="MyBucket", Prefix="myfolder/test/")

delete_keys = {'Objects': [{'Key': obj['Key']} for obj in objects_to_delete.get('Contents', [])]}

# skip the request when nothing matched; an empty Objects list is rejected
if delete_keys['Objects']:
    s3.meta.client.delete_objects(Bucket="MyBucket", Delete=delete_keys)

This should make two requests: one to fetch the objects in the folder, the second to delete all objects in said folder. A single list_objects call returns at most 1000 keys, so a larger folder needs more than one round.

https://boto3.readthedocs.org/en/latest/reference/services/s3.html#S3.Client.delete_objects
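A batch delete can fail for some keys and succeed for others; `delete_objects` reports such failures in the `Errors` list of its response. The sketch below wraps one call and returns the failed keys. `delete_batch` is a hypothetical helper name, and `client` is assumed to behave like `boto3.client('s3')`.

```python
def delete_batch(client, bucket, keys):
    """Delete up to 1000 keys in one request and return the keys that failed.

    `client` is assumed to behave like boto3.client('s3').
    """
    keys = list(keys)
    if not keys:
        return []  # nothing to delete, so no request is sent
    if len(keys) > 1000:
        raise ValueError("delete_objects accepts at most 1000 keys per request")
    response = client.delete_objects(
        Bucket=bucket,
        # Quiet=True asks S3 to report only the failures
        Delete={"Objects": [{"Key": k} for k in keys], "Quiet": True},
    )
    return [err["Key"] for err in response.get("Errors", [])]
```

With a real client this could be called as `delete_batch(boto3.client('s3'), "MyBucket", ["myfolder/test/test_1.jpg"])`; an empty return value means S3 reported no per-key errors.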

Answer 4

You can use bucket.delete_keys() with a list of keys (with a large number of keys I found this to be an order of magnitude faster than using key.delete()).

Something like this:

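delete_key_list = []
# boto 2 API; S3 keys normally have no leading slash, so the prefix has none either
for key in bucket.list(prefix='your/directory/'):
    delete_key_list.append(key)
    # flush in batches of roughly 100 keys
    if len(delete_key_list) > 100:
        bucket.delete_keys(delete_key_list)
        delete_key_list = []

# delete whatever is left over
if delete_key_list:
    bucket.delete_keys(delete_key_list)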

Answer 5

A slight improvement on Patrick's solution. As you might know, both list_objects() and delete_objects() have an object limit of 1000. This is why you have to paginate the listing and delete in chunks. This is pretty universal, and you can pass Prefix to paginator.paginate() to delete subdirectories/paths.

import boto3

bucket = 'mybucket'   # placeholder bucket name
prefix = 'myprefix/'  # placeholder; keep the trailing slash

client = boto3.client('s3')  # pass credentials here if they are not configured elsewhere
paginator = client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=bucket, Prefix=prefix)

delete_us = dict(Objects=[])
for page in pages:
    # a page without matches has no 'Contents' entry
    for item in page.get('Contents', []):
        delete_us['Objects'].append(dict(Key=item['Key']))

        # flush once aws limit reached
        if len(delete_us['Objects']) >= 1000:
            client.delete_objects(Bucket=bucket, Delete=delete_us)
            delete_us = dict(Objects=[])

# flush rest
if len(delete_us['Objects']):
    client.delete_objects(Bucket=bucket, Delete=delete_us)
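The chunking can also be separated from the S3 calls, which makes it testable without a bucket. This is a sketch under assumptions: `chunks`, `iter_keys` and `delete_prefix` are hypothetical helper names, and `client` is assumed to behave like `boto3.client('s3')`.

```python
from itertools import islice


def chunks(iterable, size=1000):
    """Yield lists of at most `size` items (1000 is the delete_objects limit)."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def iter_keys(client, bucket, prefix):
    """Yield every key under `prefix`, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # a page without matches has no "Contents" entry
        for obj in page.get("Contents", []):
            yield obj["Key"]


def delete_prefix(client, bucket, prefix):
    """Delete all keys under `prefix` in batches; return how many were sent."""
    sent = 0
    for batch in chunks(iter_keys(client, bucket, prefix)):
        client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch]},
        )
        sent += len(batch)
    return sent
```

Because `iter_keys` is a generator, keys are listed lazily and at most one batch is held in memory at a time.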

Answer 6

If versioning is enabled on the S3 bucket:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
# removes every version and delete marker under the prefix
bucket.object_versions.filter(Prefix="myprefix/").delete()
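For readers on the low-level client, the same idea can be sketched with the `list_object_versions` paginator: each version and delete marker is removed by passing its `VersionId` along with the key. `delete_all_versions` is a hypothetical helper name, and `client` is assumed to behave like `boto3.client('s3')`.

```python
def delete_all_versions(client, bucket, prefix):
    """Remove every object version and delete marker under `prefix`.

    Returns the number of entries sent for deletion.
    """
    paginator = client.get_paginator("list_object_versions")
    sent = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        objects = [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]
        # stay within the 1000-object limit of delete_objects
        for start in range(0, len(objects), 1000):
            chunk = objects[start:start + 1000]
            client.delete_objects(Bucket=bucket, Delete={"Objects": chunk})
            sent += len(chunk)
    return sent
```

Without a `VersionId`, a delete on a versioned bucket only adds a delete marker and the older versions stay in place, which is why the plain prefix delete is not enough here.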

Answer 7

If you need to filter by object contents, as I did, the following is a blueprint for your logic:

import boto3


def get_s3_objects_batches(s3, **base_kwargs):
    # yields the listing one page at a time (at most 1000 entries per page)
    kwargs = dict(MaxKeys=1000, **base_kwargs)
    while True:
        response = s3.list_objects_v2(**kwargs)
        # to yield each and every file: yield from response.get('Contents', [])
        yield response.get('Contents', [])
        if not response.get('IsTruncated'):  # At the end of the list?
            break
        continuation_token = response.get('NextContinuationToken')
        kwargs['ContinuationToken'] = continuation_token


def your_filter(obj):
    # return True for the listing entries that should be deleted
    raise NotImplementedError()


# placeholders: replace with your own values
profile_name = 'default'
bucket_name = 'mybucket'
prefix = 'myprefix/'

session = boto3.session.Session(profile_name=profile_name)
s3client = session.client('s3')
for batch in get_s3_objects_batches(s3client, Bucket=bucket_name, Prefix=prefix):
    to_delete = [{'Key': obj['Key']} for obj in batch if your_filter(obj)]
    if to_delete:
        s3client.delete_objects(Bucket=bucket_name, Delete={'Objects': to_delete})
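As an illustration of what a filter could look like, here are two sketches that work on the listing entries alone. Each entry returned by list_objects_v2 carries metadata such as `Key`, `Size` and `LastModified` (a timezone-aware datetime in boto3), not the object body. The names `is_jpeg` and `older_than` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_jpeg(obj):
    """True when the key ends in a JPEG extension (case-insensitive)."""
    return obj["Key"].lower().endswith((".jpg", ".jpeg"))


def older_than(days):
    """Build a filter that matches entries last modified more than `days` ago."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)

    def predicate(obj):
        return obj["LastModified"] < cutoff

    return predicate
```

Either one can stand in for the filter, for example by assigning `your_filter = older_than(30)`. Filtering on the actual object body would need an extra download per key, which this sketch does not attempt.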

Answer 8

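import boto3

# `config` is assumed to be a dict with your credentials, region and bucket name


def remove(path):
    # delete_object removes a single key only; for a whole "folder",
    # call it for every key under the prefix or use a batch delete
    session = boto3.Session(
        aws_access_key_id=config["aws_access_key_id"],
        aws_secret_access_key=config["aws_secret_access_key"],
        region_name=config["region_name"],
    )
    s3 = session.client('s3')
    bucket = config["bucketName"]

    try:
        s3.delete_object(Bucket=bucket, Key=path)
    except Exception as e:
        print(e)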

Answer 9

You can do it using the AWS CLI (https://aws.amazon.com/cli/) and some Unix commands.

This AWS CLI command should work:

aws s3 rm s3://<your_bucket_name> --recursive --exclude "*" --include "<your_pattern>"

The --recursive flag is what makes the command descend into sub-folders. Note that --include takes a glob-style pattern, not a regular expression.

Or with Unix commands:

aws s3 ls s3://<your_bucket_name>/ | awk '{print $4}' | xargs -I% <your_os_shell> -c 'aws s3 rm s3://<your_bucket_name>/%'

Explanation:

list all files in the bucket --pipe-->
get the 4th column (it is the file name) --pipe--> (you can replace this step with a Linux command that matches your pattern)
run the delete command with the AWS CLI
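If you would rather do the pattern selection in Python than in the shell, the standard-library `fnmatch` module understands the same kind of glob patterns (`*`, `?`, `[seq]`). This is only a sketch of the selection step, not the CLI's own implementation; `select_keys` is a hypothetical helper and the keys are made up.

```python
from fnmatch import fnmatchcase


def select_keys(keys, include="*"):
    """Return the keys that match the glob pattern, case-sensitively."""
    return [key for key in keys if fnmatchcase(key, include)]


# Made-up keys, for illustration only.
keys = ["test/test_1.jpg", "test/notes.txt", "other/test_3.jpg"]
print(select_keys(keys, "test/*.jpg"))  # ['test/test_1.jpg']
```

The selected keys can then be handed to any of the batch-delete approaches from the other answers.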

Answer 1: score 249, 2018-12-18 15:18:49
Answer 2: score 65, accepted, 2012-07-11 07:31:21
Answer 3: score 49, 2016-01-19 22:23:24
Answer 4: score 22, 2013-04-11 13:13:38
Answer 5: score 22, 2017-04-16 11:32:00
Answer 6: score 4, 2020-08-05 13:33:41
Answer 7: score 1, 2020-12-16 17:25:30
Answer 8: score 0, 2021-05-14 18:51:28
Answer 9: score -1, 2019-03-17 10:02:44
