
How do I trigger an AWS Lambda function only after a bulk upload to S3 has finished?

Original 2022-07-07 01:14:58 4 1 amazon-web-services/ amazon-s3/ aws-lambda

We have a simple ETL setup:

A vendor uploads crawled Parquet data to our S3 bucket.
An S3 event triggers a Lambda function, which starts a Glue crawler to update the existing table partitions in Glue.

This works fine most of the time, but in some cases our vendor uploads several files in quick succession, for example when refreshing historical data. This causes a problem because a Glue crawler cannot run concurrently with itself, so the job fails.

I'm wondering if there is anything we can do to avoid this error. I've looked into SQS but am not sure whether it can help. Below is what I would like to achieve:

The vendor uploads a file to S3.
S3 sends an event to SQS.
SQS holds the event and waits until no further event has arrived for a given period, say 5 minutes.
Once 5 minutes have passed with no further event, SQS triggers the Lambda function to run the Glue crawler.

Is this doable with S3 and SQS?

1 answer

SQS holds the event,

Yes, you can do this: SQS lets you set a delivery delay of up to 15 minutes.
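As a rough illustration of the delay setting, here is a minimal Python sketch that builds the arguments for boto3's `create_queue` call with the `DelaySeconds` queue attribute. The helper name `delay_queue_params`, the constant `MAX_SQS_DELAY_SECONDS`, and the queue name used in the usage note are made up for this example.

```python
# SQS caps the delivery delay at 15 minutes (900 seconds).
MAX_SQS_DELAY_SECONDS = 900


def delay_queue_params(queue_name: str, delay_seconds: int) -> dict:
    """Build keyword arguments for sqs.create_queue with a delivery delay.

    Hypothetical helper: it only validates the delay and formats the
    attributes; it makes no AWS call itself.
    """
    if not 0 <= delay_seconds <= MAX_SQS_DELAY_SECONDS:
        raise ValueError(
            f"delay_seconds must be between 0 and {MAX_SQS_DELAY_SECONDS}"
        )
    return {
        "QueueName": queue_name,
        # Queue attribute values are passed to SQS as strings.
        "Attributes": {"DelaySeconds": str(delay_seconds)},
    }
```

With boto3 installed and credentials configured, the result could be passed along as `boto3.client("sqs").create_queue(**delay_queue_params("s3-upload-events", 300))`. Note that the delay applies to each message independently: every event becomes visible 5 minutes after it was sent, regardless of whether other events followed it.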

wait until there has been no other following event for a given time period, say 5 minutes.

No, there is no automated way to do that. You have to build your own custom solution. The simplest approach would be not to couple SQS with Lambda, and instead run the Lambda on a schedule (e.g. every 5 minutes). The Lambda would need logic to determine whether any new files have been uploaded within some time window, and if not, trigger your Glue job. This would probably involve DynamoDB to keep track of the last uploaded files between Lambda executions.
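One possible shape for that logic, as a hedged sketch and not a tested production setup: an S3-triggered Lambda records the time of the latest upload, and a scheduled Lambda starts the crawler only after a quiet period. The item layout (`pk`, `uploaded_at`, `pending`), the function names, and the 300-second window are all assumptions made for this example. `table` is expected to behave like a boto3 DynamoDB `Table` resource and `glue` like a boto3 Glue client.

```python
import time

QUIET_PERIOD_SECONDS = 300  # assumed quiet window: 5 minutes
STATE_KEY = {"pk": "last_upload"}  # hypothetical single-item state record


def record_upload(table, now_epoch=None):
    """Call from the S3-triggered Lambda: remember when the last file arrived."""
    now_epoch = int(time.time()) if now_epoch is None else int(now_epoch)
    table.put_item(Item={**STATE_KEY, "uploaded_at": now_epoch, "pending": True})


def run_if_quiet(table, glue, crawler_name, now_epoch=None):
    """Call from the scheduled Lambda: start the crawler once uploads stop."""
    now_epoch = int(time.time()) if now_epoch is None else int(now_epoch)
    item = table.get_item(Key=STATE_KEY).get("Item")
    if not item or not item.get("pending"):
        return "nothing-pending"
    # DynamoDB returns numbers as Decimal, so convert before subtracting.
    if now_epoch - int(item["uploaded_at"]) < QUIET_PERIOD_SECONDS:
        return "still-uploading"
    glue.start_crawler(Name=crawler_name)
    # Clear the flag so the next scheduled run does not start the crawler again.
    table.update_item(
        Key=STATE_KEY,
        UpdateExpression="SET #p = :f",
        ExpressionAttributeNames={"#p": "pending"},
        ExpressionAttributeValues={":f": False},
    )
    return "crawler-started"
```

In a real deployment, `table` might come from `boto3.resource("dynamodb").Table(...)` and `glue` from `boto3.client("glue")`. This sketch ignores the race where a file arrives between the read and the update. It also does not handle the exception `start_crawler` raises when the crawler is already running, so both cases would need handling.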

How do I use AWS Lambda to trigger Comprehend with S3?

AWS - want to upload multiple files to S3 and only when all are uploaded trigger a lambda function

How do I add a Lambda Function with an S3 Trigger in CloudFormation?

Unable to trigger AWS Lambda by upload to AWS S3

How to trigger a Lambda on S3 file upload

How do I ensure that a Lambda function writes to Cloudwatch Logs on every S3 Put Trigger after deploy?

How to add s3 trigger to lambda function?

Trigger lambda function in a certain time range from s3 upload

How do I specify an existing Lambda function's alias as a DynamoDB trigger using the AWS CDK?

How to upload excel file to AWS S3 using an AWS lambda function in python


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.



solution1 1 2022-07-07 01:24:52
