
Stream based lambda concurrent execution

I have a Python Lambda function that is triggered when there is an INSERT or an UPDATE on a DynamoDB table. As we know, this is a stream-based invocation. If 1000 records are inserted into DynamoDB and I set the batch size to 1, my problem is that the Lambda processes each record one after the other. How do I change it to process all 1000 records in parallel (concurrent executions)? Should I import any additional Python modules, such as "from concurrent.futures import ThreadPoolExecutor", beyond what I am already using? (My code is too big to post here.)

You cannot directly control the parallelism of stream processing.

AWS DynamoDB Streams separates the change records from a table into shards. Each shard is processed serially (one batch at a time). This ensures in-order processing of the records within a shard.

However, if your table has heavy write activity, DynamoDB Streams may split shards into smaller child shards. Separate shards can be processed in parallel, each by its own Lambda invocation.

See http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html

The best control you have is to increase the batch size above 1. If you are confident that your database updates can be processed concurrently, then with a batch size > 1 you can process the multiple records of each batch concurrently inside your Lambda function.
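As a minimal sketch of that last point, the handler below fans the records of one stream batch out across a ThreadPoolExecutor, as the question suggests. The per-record logic (`process_record`), the worker count, and the event shape shown in the comments are assumptions for illustration; note that ordering within the batch is no longer guaranteed once you do this.

```python
from concurrent.futures import ThreadPoolExecutor

def process_record(record):
    """Hypothetical per-record work; replace with your own logic."""
    # DynamoDB stream records carry the event type and the item images.
    if record.get("eventName") in ("INSERT", "MODIFY"):
        new_image = record["dynamodb"].get("NewImage", {})
        # ... handle the item here ...
        return new_image
    return None

def lambda_handler(event, context):
    records = event.get("Records", [])
    # Fan the batch out across a thread pool. This only helps when the
    # batch size is > 1; with batch size 1 there is nothing to parallelize.
    with ThreadPoolExecutor(max_workers=10) as pool:
        results = list(pool.map(process_record, records))
    return {"processed": len(records)}
```

Threads suit I/O-bound work (e.g. calling other AWS services per record); for CPU-bound work inside a single Lambda invocation, threads will not add much because of the GIL.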
