AWS Lambda function to retrieve uploaded files from S3 and upload the unzipped folder back to S3

I have an S3 bucket that users upload zipped directories to, often 1 GB in size. Each zipped directory holds images in subfolders, among other files.

I need to create a Lambda function that is triggered by new uploads, unzips the file, and uploads the unzipped content back to an S3 bucket so that I can access the individual files via HTTP. I'm pretty clueless as to how to write such a function, though.

My concerns are:

  • Would Python or Java perform better than Node.js for this?
  • How do I avoid running out of memory when unzipping files of 1 GB or more? (Can I stream the content back to S3?)

The AWS Lambda FAQ states:

Each Lambda function receives 500MB of non-persistent disk space in its own /tmp directory.

This will be insufficient for storing the 1GB zip file, plus the unzipped contents.

You would need to stream the 'input' zip file in ranges, and store the unzipped files in small groups to avoid this problem. It is probably not worthwhile using Lambda for this application.
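For what it's worth, the range-streaming idea could be sketched roughly as follows in Python. This is only a sketch under assumptions: `s3` is taken to be a boto3 S3 client, `S3RangeReader` and `unzip_s3_object` are names made up for this example, and the bucket names and prefix are placeholders. The reader fetches only the byte ranges that `zipfile` asks for, and each member is handed to `upload_fileobj` one at a time, so the sketch never stores the whole archive in /tmp or in memory. It does nothing about the execution time limit.

```python
import io
import zipfile


class S3RangeReader(io.RawIOBase):
    """Read-only, seekable file object that fetches bytes from S3 on demand."""

    def __init__(self, s3, bucket, key):
        self.s3, self.bucket, self.key = s3, bucket, key
        # One HEAD request to learn the object size (needed for seek-from-end).
        self.size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
        self.pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self.pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self.pos = offset
        elif whence == io.SEEK_CUR:
            self.pos += offset
        elif whence == io.SEEK_END:
            self.pos = self.size + offset
        return self.pos

    def readinto(self, buf):
        if self.pos >= self.size or len(buf) == 0:
            return 0  # end of object
        end = min(self.pos + len(buf), self.size) - 1
        resp = self.s3.get_object(
            Bucket=self.bucket, Key=self.key, Range=f"bytes={self.pos}-{end}"
        )
        data = resp["Body"].read()
        buf[: len(data)] = data
        self.pos += len(data)
        return len(data)


def unzip_s3_object(s3, src_bucket, src_key, dst_bucket, dst_prefix=""):
    """Upload every file in the zip at src_bucket/src_key to dst_bucket."""
    raw = S3RangeReader(s3, src_bucket, src_key)
    # Buffering turns zipfile's many small reads into fewer ranged GETs.
    reader = io.BufferedReader(raw, buffer_size=8 * 1024 * 1024)
    uploaded = []
    with zipfile.ZipFile(reader) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # S3 has no real directories; keys carry the path
            dst_key = dst_prefix + info.filename
            with archive.open(info) as member:
                # Decompresses while uploading, one member at a time.
                s3.upload_fileobj(member, dst_bucket, dst_key)
            uploaded.append(dst_key)
    return uploaded
```

Calling it would look something like `unzip_s3_object(boto3.client("s3"), "upload-bucket", "archive.zip", "public-bucket", "unzipped/")`, with all four names being placeholders.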

AWS Lambda is currently limited to 5 minutes of execution time.

If your runtime of download+unzip+upload takes more than 5 minutes, Lambda will not work for you.
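If you try Lambda anyway, one way to avoid being cut off mid-file is to check the remaining time between files and stop early. A minimal sketch, assuming the Python runtime, where the `context` object exposes `get_remaining_time_in_millis()`; the function name `process_within_budget`, the `work` callback, and the 30-second safety margin are made up for this example.

```python
def process_within_budget(items, context, work, safety_margin_ms=30_000):
    """Run work(item) on items until the Lambda time budget is nearly spent.

    Returns (done, remaining) so the caller can hand the remaining items
    to a follow-up invocation or to some other worker.
    """
    remaining = list(items)
    done = []
    # Check the clock before starting each item, never in the middle of one.
    while remaining and context.get_remaining_time_in_millis() > safety_margin_ms:
        item = remaining.pop(0)
        work(item)
        done.append(item)
    return done, remaining
```

The safety margin has to be larger than the time the slowest single item takes, otherwise the function can still time out inside `work`.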

Here's a reference for one possible workaround: https://serifandsemaphore.io/aws-lambda-going-beyond-5-minutes-34e381e71231#.maz4sfo43

Lambda would not be a good fit for the actual processing of the files, for the reasons mentioned by the other posters. However, since it integrates with S3 events, it could be used as a trigger for something else. It could send a message to SQS, and another process running on EC2 (or ECS, or Elastic Beanstalk) could pick up the messages from the queue and then process the files from S3.
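The trigger half of that setup could look roughly like this in Python. It is a sketch under assumptions: `QUEUE_URL` is a placeholder, `jobs_from_event` is a helper name made up here, and the message body format is whatever your worker expects. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields come from the S3 event notification format, where object keys arrive URL-encoded.

```python
import json
from urllib.parse import unquote_plus

# Placeholder: replace with the URL of your own queue.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/unzip-jobs"


def jobs_from_event(event):
    """Extract one job description per uploaded object from an S3 event."""
    jobs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        jobs.append({
            "bucket": s3_info["bucket"]["name"],
            # Keys are URL-encoded in the event (spaces arrive as '+').
            "key": unquote_plus(s3_info["object"]["key"]),
            "size": s3_info["object"].get("size"),
        })
    return jobs


def handler(event, context, sqs=None):
    """Lambda entry point: queue a message per upload, do no heavy work."""
    if sqs is None:
        import boto3  # imported lazily so the helper above has no dependency
        sqs = boto3.client("sqs")
    jobs = jobs_from_event(event)
    for job in jobs:
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(job))
    return {"queued": len(jobs)}
```

The function finishes in milliseconds regardless of archive size, so neither the /tmp limit nor the execution time limit comes into play; the worker reading the queue does the download, unzip, and upload.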
