简体   繁体   中英

How failover works in kafka along with keeping up replication factor

I am trying to understand how failover and replication factors work in kafka.

Let's say my cluster has 3 brokers and replication factor is also 3. In this case each broker will have one copy of partition and one of the broker is leader. If leader broker fails, then one of the follower broker will become leader but now the replication factor is down to 2. At this point if I add a new broker in the cluster, will kafka make sure that replication factor is 3 and will it copy the required data on the new broker.

How will above scenario work if my cluster already has an addition broker.

Leaving out partitions (which is another concept of Kafka):

The replication factor does not say how many times a topic is replicated, but rather how many times it should be replicated. It is not affected by brokers shutting down.

Once a leader broker shuts down, the "leader" status goes over to another broker which is in sync , that means a broker that has the current state replicated and is not behind. Electing "leader" status to a broker that is not in sync would obviously lead to data loss, so this will never happen (when using the right settings).

These replicas eligible for taking "leader status" are called in-sync replica (ISR) , which is important, as there is a configuration called min.insync.replicas that specifies how many ISR have to exist for a Kafka message to be acknowledged. If this is set to 0, every Kafka message is acknowledged as "successful" as soon as it enters the "leader" broker, if this broker would die, all data that was not replicated yet is lost. If min.insync.replicas would be set to 1, every message waits with the acknowledgement, until at least 1 replica exists in order to be "successful", so if the broker would die now, there would be a replica covering this data. If there are not enough brokers to cover the minimum amount of replicas, your cluster will fail eventually.

So to answer your question: if you had 2 running brokers, min.insync.replicas=1 (default) and replication factor of 3, your cluster runs fine and will add a replica as soon as you start up another broker. If another of the 2 brokers dies before you launch the third one, you will run into problems.

In your setup (3 broker, 3 replicas), when 1 broker fails Kafka will automatically elect new leaders (on the remaining brokers) for all the partitions whose leaders were on the failing broker.

The replication factor does not change. The replication factor is a topic configuration that can only be changed by the user.

Similarly the Replica list does not change. This lists the brokers that should host each partition.

However, the In Sync Replicas (ISR) list will change and only contain the 2 remaining brokers.

If you add another broker to the cluster, what happens depend on its broker.id :

  • if the broker.id is the same as the broker that failed, this new broker will start replicating data and eventually join the ISR for all the existing partitions.
  • if it uses a different broker.id , nothing will happen. You will be able to create new topics with 3 replicas (that is not possible while there are only 2 brokers) but Kafka will not automatically replicate existing partitions. You can manually trigger a reassignment if needed, see the docs .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM