
How to handle kafka leader failover with two nodes for a topic with 2 partitions and 2 replicas?

I'm playing with Kafka in a multi-node environment to test how failover works. I currently have 2 VMs with 1 Kafka broker inside each VM, and only 1 ZooKeeper instance inside one of the two VMs. I know this is not an optimal production configuration, but it's just to train myself and understand things better.

Here is my configuration:

  - VM1 IP: 192.168.64.2 (only one broker, with broker.id=2)
  - VM2 IP: 192.168.64.3 (ZooKeeper runs here, plus one broker with broker.id=1)

I start Kafka through podman (the problem is not with podman; everything there is configured correctly).

On VM1:

podman run \
  -e KAFKA_BROKER_ID=2 \
  -e KAFKA_ZOOKEEPER_CONNECT=192.168.64.3:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9093,PLAINTEXT_HOST://192.168.64.2:29092 \
  -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2 \
  -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT_HOST \
  -e UNCLEAN_LEADER_ELECTION_ENABLE=true \
  --pod zookeeper-kafka \
  confluentinc/cp-kafka:latest

On VM2:

podman run \
  -e KAFKA_BROKER_ID=1 \
  -e KAFKA_ZOOKEEPER_CONNECT=localhost:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092,PLAINTEXT_HOST://192.168.64.3:29092 \
  -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2 \
  -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT_HOST \
  -e UNCLEAN_LEADER_ELECTION_ENABLE=true \
  --pod zookeeper-kafka \
  confluentinc/cp-kafka:latest

Now I create a topic "orders":

./kafka-topics --create --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 --replication-factor 2 --partitions 2 --topic orders
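To check the resulting partition assignment (leaders, replicas, ISR), the topic can be described with the same bootstrap servers; this produces output like the listings shown in the EDIT below:

./kafka-topics --describe --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 --topic orders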

Then I create a producer:

./kafka-console-producer --broker-list 192.168.64.2:29092,192.168.64.3:29092 --topic orders

And a consumer:

./kafka-console-consumer --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 --topic orders

Here is what I am trying to do:

  1. Start ZooKeeper, the 2 Kafka nodes, create the "orders" topic, the producer and the consumer (OK, everything works well)
  2. Send a message from my producer and check that the consumer receives it (OK)
  3. Kill the Kafka node on VM2 (OK; "kill" and "restart" here just mean stopping/starting the container, see the sketch after this list)
  4. Send another message from my producer and check that the consumer receives it (OK, the broker on VM1 can deliver the message)
  5. Restart the killed Kafka node on VM2 (OK. After that I can see that VM1 is the leader of both partitions)
  6. Send another message from my producer and check that the consumer receives it (OK)
  7. Kill the Kafka node on VM1, which is now the leader of both partitions (OK)
  8. Send another message from my producer and check that the consumer receives it (OK, the broker on VM2 can deliver the message)
  9. Restart the killed Kafka node on VM1 (OK. After that I can see that VM2 is the leader of both partitions)
  10. Send another message from my producer and check that the consumer receives it (OK)
  11. Kill the Kafka node on VM2 again (OK)
  12. Send another message from my producer and check that the consumer receives it (NOT OK): here, the producer can't send the message and my consumer never receives it. After a while I get an error in my producer:
ERROR Error when sending message to topic orders with key: null, value: 9 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for orders-0:120000 ms has passed since batch creation
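
For completeness, killing and restarting a broker in the steps above is just a matter of stopping and starting its container, for example (assuming the container was started with a name such as --name kafka, which the commands above don't set; otherwise use the container ID):

podman stop kafka     # simulate the broker failure
podman start kafka    # bring the broker back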

I really don't understand what is happening here. It worked well at the beginning, but after a few stop/start cycles of the brokers it starts to fail. I should point out that I never kill the 2 brokers at the same time.

Could you please explain what I am missing here?

Thank you all :)


EDIT

To complete the comments below:

@OneCricketeer, I put the answer to your comment here.

At startup, when everything is fine:

Topic: orders   TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2   ReplicationFactor: 2    Configs:
    Topic: orders   Partition: 0    Leader: 2   Replicas: 2,1   Isr: 2,1
    Topic: orders   Partition: 1    Leader: 1   Replicas: 1,2   Isr: 1,2

After killing VM2:

Topic: orders   TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2   ReplicationFactor: 2    Configs:
    Topic: orders   Partition: 0    Leader: 2   Replicas: 2,1   Isr: 2
    Topic: orders   Partition: 1    Leader: 2   Replicas: 1,2   Isr: 2

After killing VM1:

Topic: orders   TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2   ReplicationFactor: 2    Configs:
    Topic: orders   Partition: 0    Leader: 1   Replicas: 2,1   Isr: 1
    Topic: orders   Partition: 1    Leader: 1   Replicas: 1,2   Isr: 1

After killing VM2:

Topic: orders   TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2   ReplicationFactor: 2    Configs:
    Topic: orders   Partition: 0    Leader: 2   Replicas: 2,1   Isr: 2
    Topic: orders   Partition: 1    Leader: 2   Replicas: 1,2   Isr: 2

(From here on, the producer can't publish messages anymore.)

After a long time of reading and investigating Kafka, I finally found the answer to my problem.

With only 2 brokers, I need the following configuration:

KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2
KAFKA_OFFSETS_TOPIC_NUM_PARTITIONS=1

The problem was the default number of partitions for the internal offsets topic (it was 49 or 50, if I remember correctly).
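
In practice, with the confluentinc/cp-kafka image this just means adding one more -e flag to the broker command shown earlier (sketch for VM1; the image maps KAFKA_OFFSETS_TOPIC_NUM_PARTITIONS to the broker property offsets.topic.num.partitions):

podman run \
  -e KAFKA_BROKER_ID=2 \
  -e KAFKA_ZOOKEEPER_CONNECT=192.168.64.3:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9093,PLAINTEXT_HOST://192.168.64.2:29092 \
  -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2 \
  -e KAFKA_OFFSETS_TOPIC_NUM_PARTITIONS=1 \
  -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT_HOST \
  -e UNCLEAN_LEADER_ELECTION_ENABLE=true \
  --pod zookeeper-kafka \
  confluentinc/cp-kafka:latest

Keep in mind that these two settings only apply when the internal __consumer_offsets topic is first auto-created, so the brokers' data (or at least that topic) has to be reset for the change to take effect.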

Now, with only one partition and 2 replicas, everything works well and I can start/stop my brokers as many times as I want; the remaining broker takes the lead and continues to handle my messages.
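
If you want to double-check, you can describe the internal offsets topic the same way as the "orders" topic; with this configuration its single partition should list both brokers in its replica list and ISR:

./kafka-topics --describe --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 --topic __consumer_offsets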

Hope this helps someone in the future.
