Delete objects from S3 by comparing the LastModified date to the current date

I want to delete S3 objects that were uploaded yesterday. The plan is to run an AWS Lambda function every day that deletes the objects uploaded the previous day.

I found some sample code in another thread and tried using it, but I get this error:

{ "errorMessage": "can't compare offset-naive and offset-aware datetimes", "errorType": "TypeError"}

I am based in Sydney, and S3 shows LastModified in my local time zone, but the Lambda function returns times in UTC. The code I found uses a static date, whereas I would like a dynamic comparison against each day's timestamp.

import os
import boto3
from datetime import datetime

bucket = os.environ["S3_BUCKET_NAME"]
s3 = boto3.client('s3', region_name='ap-southeast-2')
response = s3.list_objects_v2(Bucket=bucket)
keys_to_delete = [{'Key': object['Key']} for object in response['Contents'] if object['LastModified'] < datetime(2022, 1, 7)]
s3.delete_objects(Bucket=bucket, Delete={'Objects': keys_to_delete})

Please help me correct this, or suggest a better way to accomplish it. I am new to DevOps and don't have much coding experience.

Thank You.

Use the Storage Lifecycle feature of S3.

It lets you transition objects to another storage class (for example, Standard-IA) or expire (delete) them. You can create a lifecycle rule that deletes objects 1 day after creation.

When an object reaches the end of its lifetime based on its lifecycle policy, Amazon S3 queues it for removal and removes it asynchronously. There might be a delay between the expiration date and the date at which Amazon S3 removes an object. You are not charged for expiration or the storage time associated with an object that has expired.
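If you prefer to create the rule from code rather than the console, a minimal sketch with boto3 might look like the following. The helper names `build_expiration_rule` and `apply_expiration_rule` and the rule ID are made up for illustration; the S3 client is passed in so the functions do not assume any particular region or credentials.

```python
def build_expiration_rule(days=1, prefix=""):
    """Return a lifecycle configuration that expires objects `days` after creation."""
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}d",   # hypothetical rule name
                "Filter": {"Prefix": prefix},    # empty prefix = every object in the bucket
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_expiration_rule(s3_client, bucket, days=1):
    # Note: this call replaces any lifecycle configuration already on the
    # bucket, so merge existing rules first if there are any.
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_expiration_rule(days),
    )
```

It could be called as `apply_expiration_rule(boto3.client("s3"), os.environ["S3_BUCKET_NAME"])`. Keep in mind that S3 rounds the expiration time up to the next midnight UTC, so objects are not removed exactly 24 hours after upload.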


Reference:

Managing your storage lifecycle

It appears that your issue is that the LastModified is timezone-aware, but you are comparing it with a datetime that is not timezone-aware.
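The error can be reproduced without S3 at all. This small sketch uses a made-up LastModified value and only the standard library; it shows that the comparison fails against a naive datetime and succeeds once the cutoff carries a tzinfo.

```python
from datetime import datetime, timezone

# boto3 returns LastModified as a timezone-aware datetime in UTC
# (the value here is invented for illustration).
last_modified = datetime(2022, 1, 6, 3, 15, tzinfo=timezone.utc)

naive_cutoff = datetime(2022, 1, 7)                       # no tzinfo
aware_cutoff = datetime(2022, 1, 7, tzinfo=timezone.utc)  # tzinfo attached

try:
    last_modified < naive_cutoff
    error_message = None
except TypeError as exc:
    # Same TypeError as in the question
    error_message = str(exc)

# Comparing two aware datetimes works, whatever their time zones are.
is_older = last_modified < aware_cutoff
```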

You could do something like:

import pytz
from pytz import timezone
from datetime import datetime

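# Interpret midnight on 7 January 2022 as Sydney time, then convert it to UTC
comparison_datetime = timezone('Australia/Sydney').localize(datetime(2022, 1, 7)).astimezone(pytz.UTC)

This produces a timezone-aware datetime in UTC that can be compared with LastModified. (Calling astimezone() directly on a naive datetime would instead treat it as the system's local time, which is UTC on Lambda.)

Then, use comparison_datetime when calculating keys_to_delete.

In fact, the conversion to UTC is not strictly needed, because any two timezone-aware datetimes can be compared:

comparison_datetime = timezone('Australia/Sydney').localize(datetime(2022, 1, 7))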
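For the dynamic daily comparison asked about in the question, one possible approach is to compute the start and end of "yesterday" in the desired time zone and delete only objects whose LastModified falls inside that window. The sketch below is an assumption-laden illustration, not a drop-in Lambda: `yesterday_window` and `delete_yesterdays_objects` are hypothetical helper names, the S3 client and time zone are passed in as arguments, and a paginator is used because a single `list_objects_v2` call returns at most 1,000 keys.

```python
from datetime import datetime, time, timedelta, timezone


def yesterday_window(now, tz):
    """Return (start, end) of the previous calendar day in `tz` as aware datetimes."""
    today = now.astimezone(tz).date()
    start = datetime.combine(today - timedelta(days=1), time.min, tzinfo=tz)
    end = datetime.combine(today, time.min, tzinfo=tz)
    return start, end


def delete_yesterdays_objects(s3_client, bucket, tz, now=None):
    """Delete objects last modified during the previous day in `tz`; return their keys."""
    now = now or datetime.now(timezone.utc)
    start, end = yesterday_window(now, tz)
    deleted = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent when a page has no objects.
        keys = [
            {"Key": obj["Key"]}
            for obj in page.get("Contents", [])
            if start <= obj["LastModified"] < end
        ]
        if keys:
            # Each page holds at most 1,000 keys, which is also the
            # per-request limit of delete_objects.
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted.extend(k["Key"] for k in keys)
    return deleted
```

Inside a Lambda handler this might be called as `delete_yesterdays_objects(boto3.client("s3"), os.environ["S3_BUCKET_NAME"], ZoneInfo("Australia/Sydney"))`, with `ZoneInfo` imported from the standard `zoneinfo` module (Python 3.9+). A pytz time zone should not be passed as `tz` here, since pytz zones need `localize()` rather than the `tzinfo=` argument.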
