
Move all versions of a given file in an S3 bucket from one folder to another

I have set up an S3 bucket with versioning enabled.

An external process writes JSON files (each corresponding to a single Student entity) to the S3 bucket.

I have decided on the following folder structure for the S3 bucket:

 s3://student-data/new/       <-- all unprocessed JSON files
 s3://student-data/processed/ <-- all processed JSON files

I have a cron job that runs periodically, once every 6 hours.

New JSON files are written to the new folder by the external process.

I would like the cron job to process all the JSON files (with their associated versions) in the new folder and, once processing is done, move each file with all of its existing versions from the new folder to the processed folder.

I am able to fetch the current version of a JSON file written to the new folder and move it to the processed folder after processing.

But I cannot figure out how to move a file with all of its versions from new to processed, so that I never process the same version of a file twice.

Objects in Amazon S3 cannot be 'moved'. Rather, they need to be copied to a new key, and then the original object should be deleted.
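The copy-then-delete "move" can be sketched with boto3. This is a minimal illustration, not your processing code: the bucket and key names are taken from the question, and the optional `s3` parameter (an injectable client, my addition) just makes the function easy to test without AWS credentials.

```python
def move_object(bucket, src_key, dst_key, s3=None):
    """'Move' an object within a bucket: copy to the new key, then delete the original."""
    if s3 is None:
        import boto3  # real client only when none is injected
        s3 = boto3.client("s3")
    s3.copy_object(
        Bucket=bucket,
        CopySource={"Bucket": bucket, "Key": src_key},
        Key=dst_key,
    )
    s3.delete_object(Bucket=bucket, Key=src_key)

# e.g. move_object("student-data", "new/a.json", "processed/a.json")
```

Note that on a versioned bucket this only copies the *current* version of `src_key`, and the delete only inserts a delete marker rather than removing the old versions.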

This process would be more difficult with multiple versions of an object. You would need to copy and delete each version individually, from oldest to newest, to create new versions in the target path. It is not possible to process all versions of an object simultaneously.
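The per-version loop might look like the sketch below. It assumes boto3; `list_object_versions` returns versions newest-first, so they are sorted oldest-first before copying so the target key accumulates versions in the same order. Deleting with an explicit `VersionId` permanently removes that version (no delete marker). The injectable `s3` parameter is my addition for testability, and for brevity the sketch does not paginate the version listing.

```python
def move_all_versions(bucket, src_key, dst_key, s3=None):
    """Copy every version of src_key to dst_key, oldest first, deleting each as it goes."""
    if s3 is None:
        import boto3  # real client only when none is injected
        s3 = boto3.client("s3")
    resp = s3.list_object_versions(Bucket=bucket, Prefix=src_key)
    # Prefix matching is loose, so keep only exact-key hits
    versions = [v for v in resp.get("Versions", []) if v["Key"] == src_key]
    # Oldest first, so the target key's version history ends up in the same order
    for v in sorted(versions, key=lambda v: v["LastModified"]):
        s3.copy_object(
            Bucket=bucket,
            CopySource={"Bucket": bucket, "Key": src_key, "VersionId": v["VersionId"]},
            Key=dst_key,
        )
        # Deleting a specific VersionId removes that version permanently
        s3.delete_object(Bucket=bucket, Key=src_key, VersionId=v["VersionId"])
```

For a folder with many objects, you would list the `new/` prefix and call this per key; a real implementation should also use the `list_object_versions` paginator in case a key has more than 1000 versions.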

Versioning is typically used to retain data that is overwritten. You might want to consider whether versioning is required in your situation, since it complicates the process considerably.
