
How to delete multiple files and specific pattern in S3 boto3

Can Python delete multiple specific files in S3?

I want to delete multiple files with specific extensions.

This script removes all files.

These are the specific files that I want to delete:

XXX.tar.gz
XXX.txt

**Current code:** (all files deleted)

import boto3

accesskey = "123"
secretkey = "123"
region = "ap-northeast-1"

s3 = boto3.resource('s3', aws_access_key_id=accesskey, aws_secret_access_key=secretkey, region_name=region)

bucket = s3.Bucket('test')
files = [obj.key for obj in bucket.objects.filter(Prefix="myfolder/test/")]
tar_files = [file for file in files if file.endswith('tar.gz')]

#print(f'All files: {files}')
#print(f'Tar files: {tar_files}')

objects_to_delete = s3.meta.client.list_objects(Bucket="test", Prefix="myfolder/test/")

delete_keys = {'Objects': []}
delete_keys['Objects'] = [{'Key': key} for key in [obj['Key'] for obj in objects_to_delete.get('Contents', [])]]

s3.meta.client.delete_objects(Bucket="test", Delete=delete_keys)

If anyone knows, please let me know.

Presuming that you want to delete *.tar.gz and *.txt files from the given bucket and prefix, this would work:

import boto3

s3_resource = boto3.resource('s3')

bucket = s3_resource.Bucket('my-bucket')
objects = bucket.objects.filter(Prefix = 'myfolder/')

objects_to_delete = [{'Key': o.key} for o in objects if o.key.endswith('.tar.gz') or o.key.endswith('.txt')]

if len(objects_to_delete):
    s3_resource.meta.client.delete_objects(Bucket='my-bucket', Delete={'Objects': objects_to_delete})
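
Note that delete_objects accepts at most 1,000 keys per request, so if the prefix holds more objects than that, the keys have to be sent in chunks. A minimal sketch of that, using the same hypothetical bucket name and prefix as above:

import boto3

s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket('my-bucket')

# collect the keys to remove, then delete them in chunks of at most 1,000,
# the per-request limit of delete_objects
objects_to_delete = [{'Key': o.key}
                     for o in bucket.objects.filter(Prefix='myfolder/')
                     if o.key.endswith('.tar.gz') or o.key.endswith('.txt')]

for i in range(0, len(objects_to_delete), 1000):
    chunk = objects_to_delete[i:i + 1000]
    s3_resource.meta.client.delete_objects(Bucket='my-bucket',
                                           Delete={'Objects': chunk})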

  1. Iterate over your S3 buckets
  2. For each bucket, iterate over the files
  3. Delete the requested file types
import boto3

s3 = boto3.resource('s3')

for bucket in s3.meta.client.list_buckets()['Buckets']:
    for count, obj in enumerate(s3.Bucket(bucket['Name']).objects.filter()):
        if obj.key.endswith('.tar.gz') or obj.key.endswith('.txt'):
            print("{}: deleting: {} from: {}".format(count, obj.key, bucket['Name']))
            s3.meta.client.delete_object(Bucket=bucket['Name'], Key=obj.key)

I used this method since, for large buckets, creating the complete list and then deleting it can take some time. This way, you see the progress.
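
If both progress output and faster bulk deletion are wanted, the two ideas can be combined: collect matching keys while iterating and flush them in batches of 1,000 (the delete_objects per-request limit), printing a running count after each batch. A rough sketch, assuming the same hypothetical bucket name and prefix as in the question:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('test')  # hypothetical bucket name from the question

batch, deleted = [], 0
for obj in bucket.objects.filter(Prefix='myfolder/test/'):
    if obj.key.endswith('.tar.gz') or obj.key.endswith('.txt'):
        batch.append({'Key': obj.key})
    # flush a full batch of 1,000 keys, the delete_objects per-request limit
    if len(batch) == 1000:
        s3.meta.client.delete_objects(Bucket=bucket.name, Delete={'Objects': batch})
        deleted += len(batch)
        print(f"deleted {deleted} objects so far")
        batch = []

# delete whatever remains in the final, partial batch
if batch:
    s3.meta.client.delete_objects(Bucket=bucket.name, Delete={'Objects': batch})
    deleted += len(batch)

print(f"deleted {deleted} objects in total")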
