
ShellCommandActivity in AWS Data Pipeline

I am transferring DynamoDB data to S3 using AWS Data Pipeline. The backup arrives in the S3 bucket, but it is split into multiple files. To get the data into a single file, I used a ShellCommandActivity that runs the following command:

aws s3 cat #{myOutputS3Loc}/#{format(@scheduledStartTime,'YYYY-MM-dd')}/* > #{myRenamedFile}

This should concatenate all the files in the S3 folder into a single file named #{myRenamedFile}. But Data Pipeline reports the following error:

usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help

aws: error: argument subcommand: Invalid choice, valid choices are: ls | website | cp | mv | rm | sync | mb | rb

Does this mean cat is not supported in ShellCommandActivity, or is something wrong with my command? Is there any other way to combine the separate files into a single file within S3 itself?

There is no cat subcommand in aws s3. Other options:

  • cp/sync the files to the local machine and concatenate them there with the shell's cat command
  • List the object names and loop through them, calling aws s3 cp s3://<file> - for each and appending the output to a new file. The --recursive option would let you do this with a single cp command, but --recursive is not supported when copying to stdout
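As a rough sketch of the first option, the merge step itself is plain cat once the part files are staged locally; the bucket path and date below are hypothetical placeholders, not values from the original pipeline:

```shell
#!/bin/sh
# Minimal sketch of the sync-then-cat approach. Bucket name, prefix,
# and date are made-up placeholders for illustration only.

merge_parts() {
    # Concatenate every part file in directory $1 into the single file $2.
    cat "$1"/* > "$2"
}

# Inside the ShellCommandActivity this would be wrapped with the S3
# transfer steps (they need AWS credentials, so shown here as comments):
#
#   aws s3 sync "s3://my-bucket/backup/2024-01-01/" ./staging/
#   merge_parts ./staging combined.out
#   aws s3 cp combined.out "s3://my-bucket/backup/combined.out"
#
# The second option (streaming each object to stdout) would instead
# append each key directly, along these lines:
#
#   for key in $(aws s3 ls "s3://my-bucket/backup/2024-01-01/" | awk '{print $4}'); do
#       aws s3 cp "s3://my-bucket/backup/2024-01-01/$key" - >> combined.out
#   done
```

Either way the concatenation happens on the machine running the activity, since S3 itself has no server-side concatenate operation in the plain CLI.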
