
How to utilize shell script and AWS CLI to automatically copy a file daily from one S3 bucket to another?

I'd like to create a way (using shell scripts and AWS's CLI) so that the following can be automated:

  1. Copy specific files from an s3 bucket
  2. Paste them into a different bucket in S3.

Would the below 'sync' command work?

aws s3 sync s3://directory1/bucket1 s3://directory2/bucket2 --exclude "US*.gz" --exclude "CA*.gz" --include "AU*.gz"

The goal here is to ONLY transfer files whose filenames begin with "AU" and exclude everything else, all in automated fashion as much as possible. Also, is it possible to exclude very old files?

The second part of the question: what do I need to add to my shell script to automate this process as much as possible, given that "AU" files get dropped in this folder every day?

Copy objects

The AWS CLI can certainly copy objects between buckets. In fact, it does not even require files to be downloaded — S3 will copy directly between buckets, even if they are in different regions.

The aws s3 sync command is certainly an easy way to do it, since it will replicate any files from the source to the destination without having to specifically state which files to copy.

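To copy only AU* files, use: --exclude "*" --include "AU*"

Filters are applied in the order given and later filters take precedence, so excluding everything first and then including AU* leaves only the AU files. The command in the question would also copy any file that is not US*.gz or CA*.gz, because files are included by default.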

See: Use of Exclude and Include Filters
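
As a minimal sketch of the filter order described above, the command can be wrapped in a small shell function. The function name sync_au_files and the bucket paths in the usage note are placeholders, not names from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy only objects whose key, relative to the source
# path, starts with "AU". The broad --exclude must come before the --include
# because later filters take precedence over earlier ones.
sync_au_files() {
  local src="$1" dst="$2"
  shift 2
  # Any remaining arguments (for example --dryrun) are passed through to sync.
  aws s3 sync "$src" "$dst" --exclude "*" --include "AU*" "$@"
}
```

For example, `sync_au_files s3://source-bucket/incoming/ s3://destination-bucket/incoming/ --dryrun` should list what would be copied without transferring anything; dropping --dryrun performs the copy.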

You asked about excluding old files. The sync command only copies files that are new or have changed since the last run, so files that were previously copied will not be copied again. Note that the first run copies every matching file regardless of age. By default, files deleted from the source are not deleted from the destination unless you request it with the --delete option.

Automate

How to automate this? The most cloud-worthy way to do this would be to create an AWS Lambda function. The Lambda function can be automatically triggered by an Amazon CloudWatch Events (now Amazon EventBridge) rule on a regular schedule.

However, the AWS CLI is not installed by default in Lambda, so it might be a little more challenging. See: Running aws-cli Commands Inside An AWS Lambda Function - Alestic.com

It would be better to have the Lambda function do the copy itself, rather than calling the AWS CLI.
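
The daily schedule mentioned above could be set up from a workstation with the AWS CLI, roughly as sketched here. This assumes the Lambda function already exists; the function name schedule_daily_copy, the rule name, and the ARN passed in are all placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a once-a-day rule and point it at an existing
# Lambda function. Arguments: <rule-name> <lambda-function-arn>
schedule_daily_copy() {
  local rule_name="$1" function_arn="$2"
  local rule_arn

  # Create (or update) the scheduled rule and capture its ARN.
  rule_arn="$(aws events put-rule \
      --name "$rule_name" \
      --schedule-expression "rate(1 day)" \
      --query RuleArn --output text)"

  # Allow the rule to invoke the function. This call fails if a statement
  # with the same id already exists, so it is not safe to re-run blindly.
  aws lambda add-permission \
      --function-name "$function_arn" \
      --statement-id "${rule_name}-invoke" \
      --action lambda:InvokeFunction \
      --principal events.amazonaws.com \
      --source-arn "$rule_arn"

  # Register the function as the target of the rule.
  aws events put-targets \
      --rule "$rule_name" \
      --targets "Id=1,Arn=${function_arn}"
}
```

A cron(...) schedule expression can be used instead of rate(1 day) if the copy needs to run at a fixed time of day.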

Alternative idea

Amazon S3 can be configured to trigger an AWS Lambda function whenever a new object is added to an S3 bucket. This way, as soon as the object is added in S3, it will be copied to the other Amazon S3 bucket. Logic in the Lambda function can determine whether or not to copy the file, such as checking that its name starts with AU.
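
One way to wire up the trigger described above is an S3 event notification, sketched below with the AWS CLI. As a variation on checking the name inside the function, this sketch uses a key prefix filter so that S3 only invokes the function for AU objects. The function name notify_lambda_on_au_upload, the bucket, and the ARN are placeholders, and the Lambda function is assumed to already exist and to allow invocation by s3.amazonaws.com.

```shell
#!/usr/bin/env bash
# Hypothetical helper: invoke a Lambda function for every object created in
# a bucket whose key starts with a given prefix (default "AU").
# Arguments: <bucket> <lambda-function-arn> [key-prefix]
notify_lambda_on_au_upload() {
  local bucket="$1" function_arn="$2" prefix="${3:-AU}"
  local config

  # Build the notification configuration as a JSON string.
  config="$(printf '{"LambdaFunctionConfigurations":[{"LambdaFunctionArn":"%s","Events":["s3:ObjectCreated:*"],"Filter":{"Key":{"FilterRules":[{"Name":"prefix","Value":"%s"}]}}}]}' \
      "$function_arn" "$prefix")"

  # Note: this call replaces any notification configuration already on the bucket.
  aws s3api put-bucket-notification-configuration \
      --bucket "$bucket" \
      --notification-configuration "$config"
}
```

The prefix is matched against the full object key, so if the files land in a folder the prefix has to include it, for example incoming/AU.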

