
Fastest way to delete files in Amazon S3

With boto3, one can delete files in a bucket as follows:

# Deletes matching objects one at a time (one DELETE request per object).
for obj in bucket.objects.all():
    if 'xyz' in obj.key:
        obj.delete()

This sends one REST API call per file. If you have a large number of files, this can take a long time.

Is there a faster way to do this?

The easiest way to delete files is by using Amazon S3 Lifecycle Rules. Simply specify the prefix and an age (e.g. 1 day after creation) and S3 will delete the files for you!

However, this is not necessarily the fastest way to delete them: it might take 24 hours until the rule is executed.
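For reference, here is a rough sketch of what setting up such a rule from boto3 might look like. The helper names, the rule ID and the example prefix are made up for illustration; `s3_client` is assumed to be something like `boto3.client('s3')`. Two things to keep in mind: a lifecycle filter matches on a key prefix, not on an arbitrary substring like the `'xyz' in key` test in the question, and `put_bucket_lifecycle_configuration` replaces whatever lifecycle configuration the bucket already has.

```python
def expiration_rule(prefix, days, rule_id='expire-by-prefix'):
    """Build one lifecycle rule that expires objects under `prefix` after `days` days.

    `rule_id` is an arbitrary label chosen for this example.
    """
    return {
        'ID': rule_id,
        'Filter': {'Prefix': prefix},
        'Status': 'Enabled',
        'Expiration': {'Days': days},
    }


def apply_expiration_rule(s3_client, bucket_name, prefix, days):
    """Install the rule on the bucket.

    Note: this overwrites any existing lifecycle configuration on the bucket.
    """
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration={'Rules': [expiration_rule(prefix, days)]},
    )
```

You would call it along the lines of `apply_expiration_rule(boto3.client('s3'), 'mybucket', 'xyz/', 1)`, where the bucket name and prefix are placeholders.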

If you really want to delete the objects yourself, use delete_objects() instead of delete_object(). It can accept up to 1000 keys per call, which will be faster than deleting each object individually.

There are many ways to accomplish what you are asking.

Use a Python list comprehension to get the list of objects that meet your criteria:

myobjects = [{'Key':obj.key} for obj in bucket.objects.all() if 'xyz' in obj.key]

Once you store the objects to be deleted in myobjects, call bulk delete:

bucket.delete_objects(Delete={ 'Objects': myobjects})

delete_objects(**kwargs)

This operation enables you to delete multiple objects from a bucket using a single HTTP request. You may specify up to 1000 keys.

If there are more than 1000 keys, then it is a matter of looping through the list, slicing off 1000 keys in each iteration, and calling delete_objects() on each slice.
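A minimal sketch of that loop, assuming `bucket` is a boto3 Bucket resource and `keys` is a list shaped like myobjects above (a list of `{'Key': ...}` dicts). The function name `delete_in_batches` is just something I made up for the example; `Quiet` asks S3 to report only the keys that failed.

```python
def delete_in_batches(bucket, keys, batch_size=1000):
    """Delete `keys` from `bucket` in slices of at most `batch_size`.

    Returns the per-key errors reported by S3 (an empty list if none).
    """
    errors = []
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        # One HTTP request removes the whole slice.
        response = bucket.delete_objects(Delete={'Objects': batch, 'Quiet': True})
        errors.extend(response.get('Errors', []))
    return errors
```

Called as `delete_in_batches(bucket, myobjects)`, 2500 matching keys would cost three requests instead of 2500. It is worth checking the returned list, since a batch request can succeed overall while individual keys fail.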

Boto provides support for MultiDelete. Here's an example of how you would use it:

import boto.s3
conn = boto.s3.connect_to_region('us-east-1')  # or whatever region you want
bucket = conn.get_bucket('mybucket')
keys_to_delete = ['mykey1', 'mykey2', 'mykey3', 'mykey4']
result = bucket.delete_keys(keys_to_delete)
