
Lambda times out and can't process all results from DynamoDB. Any tips to optimize this other than moving to Fargate?

The Lambda needs to get all results from DynamoDB, perform some processing on each record, and trigger a Step Functions workflow. Although DynamoDB returns paginated results, the Lambda will time out if there are too many pages to process within the 15-minute Lambda limit. Is there any workaround that keeps this on Lambda, other than moving to Fargate?

Overview of the Lambda

while True:
    l, nextToken = get list of records from DynamoDB
    for each record in l:
        perform some preprocessing, like reading a file and triggering a workflow
    if nextToken == None:
        break

I assume processing one record can fit inside the 15-minute lambda limit.

What you can do is turn your original Lambda into an orchestrator that calls a worker Lambda to process a single record (a sketch of both Lambdas follows the outline below).

  • Orchestrator Lambda

     while True:
         l, nextToken = get list of records from DynamoDB
         for each record in l:
             call the worker lambda, passing the record as the event
         if nextToken == None:
             break
  • Worker Lambda

     perform some preprocessing like reading a file and triggering a workflow
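
Here is a minimal sketch of the two Lambdas, assuming the worker is deployed under a placeholder name ("worker-lambda") and is invoked asynchronously so the orchestrator never waits for a record's processing to finish:

import json
import boto3

dynamodb = boto3.client("dynamodb")
lambda_client = boto3.client("lambda")

def orchestrator_handler(event, context):
    start_key = None  # DynamoDB pagination token (LastEvaluatedKey)
    while True:
        kwargs = {"TableName": "my-table"}  # placeholder table name
        if start_key:
            kwargs["ExclusiveStartKey"] = start_key
        page = dynamodb.scan(**kwargs)
        for record in page["Items"]:
            # "Event" = asynchronous (fire-and-forget) invocation of the worker
            lambda_client.invoke(
                FunctionName="worker-lambda",  # placeholder worker function name
                InvocationType="Event",
                Payload=json.dumps(record),
            )
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            break

def worker_handler(event, context):
    # event is the single DynamoDB record passed by the orchestrator;
    # do the preprocessing here (read the file, start the Step Functions workflow, ...)
    pass

Because each invoke call with InvocationType="Event" returns immediately, the orchestrator only spends time paging through DynamoDB, which makes it far less likely to hit the 15-minute limit; the actual per-record work happens in the worker invocations.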

You can use SQS to process these records in rapid succession. You can even use it to process them more or less in parallel rather than sequentially.

Lambda reads from DynamoDB -> breaks each entry into a JSON object -> sends the JSON object to SQS -> SQS queues them out to multiple Lambda invocations -> each of those Lambdas is designed to handle one single entry and finish

Doing this allows you to split up long tasks that may take many hours across multiple Lambda invocations, by designing the second Lambda to handle only one iteration of the task and using SQS as your loop/iterator. You can configure SQS to dispatch messages as fast as possible or one at a time (though if you go one at a time, you will have to manage the time-to-live and staleness settings of the messages in the queue).
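
Here is a rough sketch of that flow, assuming the queue already exists (the queue URL and table name below are placeholders) and is configured as an event source for the worker Lambda:

import json
import boto3

dynamodb = boto3.client("dynamodb")
sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/records-queue"  # placeholder queue URL

def producer_handler(event, context):
    # Page through DynamoDB and push each item onto the queue
    start_key = None
    while True:
        kwargs = {"TableName": "my-table"}  # placeholder table name
        if start_key:
            kwargs["ExclusiveStartKey"] = start_key
        page = dynamodb.scan(**kwargs)
        for item in page["Items"]:
            sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(item))
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            break

def worker_handler(event, context):
    # Invoked by the SQS event source mapping; each message body is one DynamoDB item
    for message in event["Records"]:
        item = json.loads(message["body"])
        # do the per-item work here: read the file, trigger the Step Functions workflow, ...
        print("processing item:", item)

With the queue configured as an event source for the worker, Lambda scales the worker out automatically and retries failed messages, which is what gives you the rough parallelism described above.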

In addition, if this is a recurring situation where new items get added to the DynamoDB table and then have to be processed, you should make use of DynamoDB Streams: every time a new item is added, the stream triggers a Lambda on that new item, allowing you to run your workflow in real time as items are added.
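
A sketch of the stream-triggered handler, assuming the table's stream is enabled with the NEW_IMAGE view type and mapped to this Lambda:

def stream_handler(event, context):
    # Invoked by the DynamoDB Streams event source mapping
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue  # only react to newly added items
        new_image = record["dynamodb"]["NewImage"]  # the new item, in DynamoDB's typed JSON format
        # run the per-item workflow here (read the file, start the Step Functions workflow, ...)
        print("new item:", new_image)

This also removes the need for a full-table scan for newly added items, since each item is handled as it arrives.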
