What is the issue with this Python script that deletes the contents of multiple S3 buckets concurrently and waits until deletion finishes?
I am trying to create a Python script that deletes the contents of 6 S3 buckets simultaneously, waits until all the data is deleted, and handles more than 1000 objects per bucket. However, I randomly encounter the error "KeyError: 'endpoint_resolver'". My AWS configuration is correct, since I can list the S3 buckets with the AWS CLI. Can you help me resolve this issue?
The code I have written is as follows:
import boto3
import concurrent.futures

def delete_s3_bucket_contents(bucket_name):
    sess = boto3.session.Session()
    s3 = sess.client('s3')
    bucket = boto3.resource('s3').Bucket(bucket_name)
    objects_to_delete = [{'Key': obj.key} for obj in bucket.objects.all()]
    while objects_to_delete:
        response = s3.delete_objects(
            Bucket=bucket_name,
            Delete={
                'Objects': objects_to_delete[:1000],
                'Quiet': True
            }
        )
        objects_to_delete = objects_to_delete[1000:]

def delete_multiple_buckets(bucket_names, max_workers=6):
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(delete_s3_bucket_contents, bucket) for bucket in bucket_names]
        concurrent.futures.wait(futures)
        for future in concurrent.futures.as_completed(futures):
            future.result()

bucket_names = ["A", "B", "C", "D", "E", "F"]
delete_multiple_buckets(bucket_names)
I also tried to delete the data from these 6 buckets simultaneously in bash:
parallel -j 6 "aws s3api list-objects --bucket {} --query '{Contents: [Contents[].{Key: Key}]}' --output json | jq -r '.Contents[].Key' | xargs -I {} -n 1000 aws s3api delete-objects --bucket {} --delete '{\"Objects\":[{\"Key\":\"{}\"}],\"Quiet\":true}' " ::: "${destination_buckets[@]}"
but it throws a jq error:
jq: error (at <stdin>:189150): Cannot index array with string "Key"
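This jq error is consistent with the shape produced by the --query expression '{Contents: [Contents[].{Key: Key}]}': the extra [...] wraps the key objects in a second level of array, so .Contents[] yields an array (which .Key cannot index) rather than the objects themselves. A minimal reproduction with a hand-written sample document standing in for the list-objects output:

```shell
# Sample of the nested shape the --query expression produces
json='{"Contents": [[{"Key": "a.txt"}, {"Key": "b.txt"}]]}'

# Fails: .Contents[] yields the inner *array*, and .Key cannot index an array
echo "$json" | jq -r '.Contents[].Key' 2>/dev/null || echo "jq failed"

# Works: descend one more level (or, equivalently, drop the extra
# [...] from the --query expression so the nesting never appears)
echo "$json" | jq -r '.Contents[][].Key'
```

The same applies to the real pipeline: either change the jq filter to '.Contents[][].Key', or simplify the CLI side to --query 'Contents[].Key' so no wrapping object is needed at all.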
I can run the aws s3 rm command instead, but it is very slow at deleting.
Following the general example, you can create the session in your delete_multiple_buckets function, use it to make a single client, and just pass that client to each worker. The "KeyError: 'endpoint_resolver'" typically comes from touching the implicit default session from multiple threads at once (as boto3.resource('s3') does in your worker); boto3 sessions are not thread-safe, while a botocore client, once created, can be shared across threads.
import boto3.session
from concurrent.futures import ThreadPoolExecutor, as_completed

def delete_s3_bucket_contents(client, bucket_name):
    # Put your thread-safe code here
    ...

def delete_multiple_buckets(bucket_names, max_workers=6):
    # Create a session and use it to make our client
    session = boto3.session.Session()
    s3_client = session.client('s3')
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Just pass your client as an argument
        futures = [executor.submit(delete_s3_bucket_contents, s3_client, bucket)
                   for bucket in bucket_names]
        # Wait for every worker and re-raise any exception it hit
        for future in as_completed(futures):
            future.result()