简体   繁体   中英

What happens if the number of workers is > number of shards when using KCL with AWS Kinesis streams?

The AWS Kinesis stream documentation mentions

Typically, when you use the KCL, you should ensure that the number of instances does not exceed the number of shards

What would be the consequence if the number of instances exceeds the number of shards? I plan on running one worker per Web server (separate thread). So I want to know whether it is required to check and compare the number of shards and running workers when a new web server instance is started. Or can one just start another worker without any side effect if the number of workers exceeds the number of shards.

TL; DR: There can only be one Worker per Shard. Any additional Workers will sit idle.

If you have a Kinesis stream with two shards, and you run an app on a single instance that leverages the KCL, the app will run two workers in separate threads-- one Worker per Shard (per thread).

If you run two instances, your app will run a single Worker on each instance in a thread-- two instances, one worker each; one Kinesis stream, two shards.

Each worker takes out a lease against a shard in a stream so no other worker of the same app can read the same shard. The Worker stores the lease information in Dynamo DB so other Workers can read it.

If you were to run 3 instances in this scenario, one of the instances would sit around waiting for a Worker on one of the other instances to lose its lease. Once one of the other Workers loses its lease, the third Worker could pick up the stream and begin processing.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM