
AWS Kinesis + Lambda Function: can I invoke multiple instances of one Lambda Function concurrently in one shard?

I have one AWS Lambda function consuming events from a Kinesis data stream. Per the documentation, multiple instances of the function can run in parallel, each instance processing one shard. Can multiple instances of the function process the same shard at the same time? In the sample, the batch size is 2 and one instance is processing e_4 and e_5; could another instance be processing e_2 and e_3 at the same time? IMO this won't happen: only after the invocation for e_4 and e_5 completes can another invocation start to process e_2 and e_3. Is my understanding correct?


Yes, you are right. The Kinesis-Lambda trigger runs only one Lambda invocation per shard at a time.

Moreover, this is not an AWS-specific limitation; it is common for ordered streams such as Kinesis or Kafka. There are a couple of reasons for enforcing it:

  • In case of failure, the offset should be reset to the failed message. If a Lambda invocation had already started processing messages after the failed one, those messages would be redelivered (or the failed one would have to be skipped).

  • Very often the key (hash) is chosen to guarantee ordered processing. For example, the order ID is the key and the stream carries actions on orders; all actions for a given order should be performed in the correct order.
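The first point can be illustrated with a small simulation. This is only a sketch of the idea, not how the Lambda service is implemented: `process_shard`, `flaky_handler`, and `max_attempts` are hypothetical names, and the shard is modeled as a plain Python list. The next batch is handed out only after the current one succeeds, and a failed batch is retried from the same position.

```python
from typing import Callable, List


def process_shard(records: List[str], batch_size: int,
                  handler: Callable[[List[str]], None],
                  max_attempts: int = 3) -> List[List[str]]:
    """Deliver batches from one shard strictly in order (hypothetical model)."""
    delivered = []  # every batch handed to the handler, retries included
    position = 0    # index of the first unprocessed record (the checkpoint)
    while position < len(records):
        batch = records[position:position + batch_size]
        for _ in range(max_attempts):
            delivered.append(batch)
            try:
                handler(batch)
            except Exception:
                continue  # checkpoint does not move; the same batch is retried
            break         # success: stop retrying
        else:
            raise RuntimeError(f"batch {batch} failed {max_attempts} times")
        position += len(batch)  # advance only after the batch succeeded
    return delivered


# Make e_2 fail exactly once to show the retry behavior.
failures = {"e_2": 1}


def flaky_handler(batch: List[str]) -> None:
    for record in batch:
        if failures.get(record, 0) > 0:
            failures[record] -= 1
            raise ValueError(f"cannot process {record}")


log = process_shard(["e_0", "e_1", "e_2", "e_3", "e_4", "e_5"], 2, flaky_handler)
for delivered_batch in log:
    print(delivered_batch)
```

In this model the batch containing e_2 and e_3 is delivered twice, and e_4 and e_5 are not touched until it succeeds, so no record is processed ahead of a failed one.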

No, this will not happen. Shards are the concurrency units for the Lambda integration. The number of concurrent executions of a single Lambda function is bounded above by the number of shards (assuming the default of one concurrent batch per shard): if you have 10 shards, then at most 10 instances of that function can process the stream those shards belong to.

Your Lambda function is a consumer application for your data stream. It processes one batch of records at a time from each shard.

To increase the speed at which your function processes records, add shards to your data stream.

The number of concurrent executions may be lower if you don't reserve concurrency for the Lambda function and the available concurrency is being used up elsewhere.

If your function can't scale up to handle one concurrent execution per shard, request a limit increase or reserve concurrency for your function. The concurrency available to your function should match or exceed the number of shards in your Kinesis data stream.
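As a rough sketch of the arithmetic behind this, assuming one batch in flight per shard as described above, the effective parallelism is the smaller of the shard count and the concurrency available to the function. The function names below are hypothetical and are not part of any AWS API.

```python
def max_concurrent_invocations(shard_count: int, available_concurrency: int) -> int:
    """Upper bound on simultaneous invocations of one function for one stream,
    assuming one batch in flight per shard (hypothetical helper)."""
    if shard_count < 0 or available_concurrency < 0:
        raise ValueError("counts must be non-negative")
    return min(shard_count, available_concurrency)


def shards_waiting(shard_count: int, available_concurrency: int) -> int:
    """Shards that cannot be served at the same moment under that bound."""
    return shard_count - max_concurrent_invocations(shard_count, available_concurrency)


# 10 shards with enough concurrency: every shard gets its own invocation.
print(max_concurrent_invocations(10, 10), shards_waiting(10, 10))

# 10 shards but only 4 executions available: 6 shards have to wait.
print(max_concurrent_invocations(10, 4), shards_waiting(10, 4))
```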

Note that different functions may process the same stream at the same time; the above restriction holds only in the context of a single Lambda function.

You can create multiple event source mappings to process the same data with multiple Lambda functions, or process items from multiple data streams with a single function.

quotes taken from here

If for whatever reason you need the same code to process the same shard at the same time, create two Lambda functions containing the same code and integrate both of them with the stream. Note that they will not be aware of each other's processing state.
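A minimal sketch of what "not aware of each other's state" means, with the shard modeled as a list and each function's integration modeled as a hypothetical `Consumer` that keeps its own read position. The class and names are illustrative only and do not correspond to any AWS API.

```python
from typing import List


class Consumer:
    """Hypothetical stand-in for one function's integration with the stream."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.position = 0  # each consumer tracks its own progress

    def poll(self, shard: List[str], batch_size: int) -> List[str]:
        """Return the next batch for this consumer and advance its position."""
        batch = shard[self.position:self.position + batch_size]
        self.position += len(batch)
        return batch


shard = ["e_0", "e_1", "e_2", "e_3", "e_4", "e_5"]
first = Consumer("function-a")
second = Consumer("function-b")

first_batches = [first.poll(shard, 2), first.poll(shard, 2)]
second_batches = [second.poll(shard, 2)]

# Both consumers receive the same records; neither sees the other's progress.
print(first.name, first_batches, "position:", first.position)
print(second.name, second_batches, "position:", second.position)
```

Because every record reaches both consumers, any side effects in the shared code run twice unless the code itself deduplicates them.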

