
How do I delete all but the 5 most recently updated files from AWS S3?

I can list the five most recently updated files in an S3 bucket with the following command:

aws s3 ls s3://somebucket/ --recursive | sort | tail -n 5 | awk '{print $4}'

Now I need to delete every file in the bucket except the five returned by the command above.

For example, if the command returns 1.txt, 2.txt, 3.txt, 4.txt and 5.txt, I need to delete everything in the bucket except those five files.

Use the aws s3 rm command with multiple --exclude options (I assume the last 5 files do not share a common pattern). The patterns are evaluated relative to the source path, so they should not include the bucket name:

aws s3 rm s3://somebucket/ --recursive --exclude "1.txt" --exclude "2.txt" --exclude "3.txt" --exclude "4.txt" --exclude "5.txt"

CAUTION: Run the command with the --dryrun option first and verify that the 5 files are not among those listed for deletion before actually removing anything.
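If the list of files to keep changes often, the --exclude options can be generated instead of typed by hand. The following is only a sketch: the function name rm_all_except, the keep-list file and the --force convention are my own inventions, not part of the AWS CLI. It assumes the keys are listed one per line, relative to the bucket root, and it defaults to --dryrun.

```shell
# Delete everything under s3://BUCKET/ except the keys listed in KEYFILE
# (one key per line, relative to the bucket root).
# Runs with --dryrun unless the third argument is "--force".
# rm_all_except and --force are illustrative names, not AWS CLI features.
rm_all_except() {
    bucket=$1
    keyfile=$2
    mode=${3:---dryrun}

    set --                          # reuse "$@" as the list of extra arguments
    while IFS= read -r key; do
        if [ -n "$key" ]; then
            set -- "$@" --exclude "$key"
        fi
    done < "$keyfile"

    # An empty keep-list would delete the whole bucket, so refuse.
    if [ "$#" -eq 0 ]; then
        echo "rm_all_except: no keys to keep, refusing to continue" >&2
        return 1
    fi

    if [ "$mode" = "--force" ]; then
        aws s3 rm "s3://$bucket/" --recursive "$@"
    else
        aws s3 rm "s3://$bucket/" --recursive --dryrun "$@"
    fi
}

# Example (not executed here):
#   aws s3 ls s3://somebucket/ --recursive | sort | tail -n 5 | awk '{print $4}' > keep.txt
#   rm_all_except somebucket keep.txt            # dry run
#   rm_all_except somebucket keep.txt --force    # really delete
```

Keys containing *, ? or [ are treated as wildcard patterns by --exclude, so such a key may protect more files than intended, but it should never cause one of the kept files to be deleted.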

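Use a negative number with head (supported by GNU head) to get all but the last n lines, then extract the key column before passing it to aws s3 rm:

# Assumes keys contain no spaces, as in the question's awk '{print $4}'.
aws s3 ls s3://somebucket/ --recursive | sort | head -n -5 | awk '{print $4}' | while read -r key ; do
    echo "Removing ${key}"
    aws s3 rm "s3://somebucket/${key}"
done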
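A negative count for head is a GNU coreutils extension, so the pipeline above may fail on systems with a different head (BSD and macOS, as far as I know). A minimal portable stand-in, assuming only POSIX awk, could look like this; the function name drop_last is hypothetical.

```shell
# Print every line of stdin except the last N (default 5).
# Portable stand-in for GNU `head -n -N`. N must be at least 1.
# drop_last is an illustrative name.
drop_last() {
    awk -v n="${1:-5}" '
        NR > n { print buf[NR % n] }   # emit the line read n lines ago
        { buf[NR % n] = $0 }           # store the current line in its slot
    '
}

# Example (not executed here):
#   aws s3 ls s3://somebucket/ --recursive | sort | drop_last 5
```

Only the last N lines are held in memory at any time, so this should also cope with long listings.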

I combined a number of solutions and came up with this to remove all but the last 30 files. Note that the listing is sorted on the date column and then the time column in a single sort; piping through two separate sorts would leave it ordered by time of day only. This also handles files with spaces in their names.

aws s3 ls s3://your-bucket/ --recursive | sort -k1,1 -k2,2 | head -n -30 | awk '{$1=$2=$3=""; print $0}' | sed 's/^ *//' | while IFS= read -r line ; do
    echo "Removing \"${line}\""
    aws s3 rm "s3://your-bucket/${line}"
done
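One caveat with the awk step above: blanking the first three fields makes awk rebuild the line with single spaces, so a key that contains a run of two or more spaces would be altered. A possible alternative, sketched here under the assumption that every line has the usual "date time size key" layout of aws s3 ls, strips the three leading columns with sed and leaves the key untouched. The function name ls_keys is hypothetical.

```shell
# Strip the date, time and size columns from `aws s3 ls` output on stdin,
# leaving each key exactly as listed (internal spaces preserved).
# Assumes lines of the form "DATE TIME <padding>SIZE KEY".
# ls_keys is an illustrative name.
ls_keys() {
    sed 's/^[^ ]* *[^ ]* *[0-9]* //'
}

# Example (not executed here):
#   aws s3 ls s3://your-bucket/ --recursive | sort -k1,1 -k2,2 | ls_keys
```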
