I'm playing with Kafka in a multi-node environment to test how failover works. I have 2 VMs with 1 Kafka broker inside each VM, and only 1 ZooKeeper instance inside one of the two VMs. I know this is not an optimal production configuration, but it's just to train myself and understand things better.
Here is my configuration: VM1 IP: 192.168.64.2 (with only one broker, broker.id=2); VM2 IP: 192.168.64.3 (with ZooKeeper running here and a broker with broker.id=1).
I start Kafka through podman (this is not a problem with podman; everything is well configured).
On VM1:
podman run -e KAFKA_BROKER_ID=2 -e KAFKA_ZOOKEEPER_CONNECT=192.168.64.3:2181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9093,PLAINTEXT_HOST://192.168.64.2:29092 -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2 -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT_HOST -e UNCLEAN_LEADER_ELECTION_ENABLE=true --pod zookeeper-kafka confluentinc/cp-kafka:latest
On VM2:
podman run -e KAFKA_BROKER_ID=1 -e KAFKA_ZOOKEEPER_CONNECT=localhost:2181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092,PLAINTEXT_HOST://192.168.64.3:29092 -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2 -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT_HOST -e UNCLEAN_LEADER_ELECTION_ENABLE=true --pod zookeeper-kafka confluentinc/cp-kafka:latest
Now I create a topic "orders":
./kafka-topics --create --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 --replication-factor 2 --partitions 2 --topic orders
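To double-check how the partitions were placed right after creation, the same CLI can describe the topic (same bootstrap servers as above):

```shell
# Show leader, replicas, and ISR per partition for the "orders" topic
./kafka-topics --describe \
  --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 \
  --topic orders
```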
Then I create a producer:
./kafka-console-producer --broker-list 192.168.64.2:29092,192.168.64.3:29092 --topic orders
And a consumer:
./kafka-console-consumer --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 --topic orders
Here is the error I get:
ERROR Error when sending message to topic orders with key: null, value: 9 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for orders-0:120000 ms has passed since batch creation
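As a side note, while debugging this kind of failure it can help to shorten the producer timeouts so the error shows up faster than the default 120000 ms. The values below are just examples, not recommended settings:

```shell
# Fail faster than the default delivery.timeout.ms of 120000 ms
# (delivery.timeout.ms must stay >= request.timeout.ms + linger.ms)
./kafka-console-producer \
  --broker-list 192.168.64.2:29092,192.168.64.3:29092 \
  --topic orders \
  --producer-property request.timeout.ms=10000 \
  --producer-property delivery.timeout.ms=15000
```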
I really don't understand what is happening here. It works well at the beginning, but after a start/stop/start cycle of a broker, it starts to fail. I should point out that I never kill the 2 brokers at the same time.
Could you please explain what I am missing here?
Thank you all:)
EDIT
To complete comments below:
@OneCricketeer, I put the answer to your comment here.
At startup, when everything is fine:
Topic: orders TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2 ReplicationFactor: 2 Configs:
Topic: orders Partition: 0 Leader: 2 Replicas: 2,1 Isr: 2,1
Topic: orders Partition: 1 Leader: 1 Replicas: 1,2 Isr: 1,2
After killing VM2:
Topic: orders TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2 ReplicationFactor: 2 Configs:
Topic: orders Partition: 0 Leader: 2 Replicas: 2,1 Isr: 2
Topic: orders Partition: 1 Leader: 2 Replicas: 1,2 Isr: 2
After killing VM1:
Topic: orders TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2 ReplicationFactor: 2 Configs:
Topic: orders Partition: 0 Leader: 1 Replicas: 2,1 Isr: 1
Topic: orders Partition: 1 Leader: 1 Replicas: 1,2 Isr: 1
After killing VM2:
Topic: orders TopicId: I3hMNln9TpSuo76xHSpMXQ PartitionCount: 2 ReplicationFactor: 2 Configs:
Topic: orders Partition: 0 Leader: 2 Replicas: 2,1 Isr: 2
Topic: orders Partition: 1 Leader: 2 Replicas: 1,2 Isr: 2
(From here, the producer can't publish messages anymore.)
After a long time spent reading and investigating Kafka, I finally found the answer to my problem.
With only 2 brokers, I need the following configuration:
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2
KAFKA_OFFSETS_TOPIC_NUM_PARTITIONS=1
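For example, the full command on VM1 would then look like this (same image, pod, and environment as above; only KAFKA_OFFSETS_TOPIC_NUM_PARTITIONS is new):

```shell
podman run \
  -e KAFKA_BROKER_ID=2 \
  -e KAFKA_ZOOKEEPER_CONNECT=192.168.64.3:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9093,PLAINTEXT_HOST://192.168.64.2:29092 \
  -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2 \
  -e KAFKA_OFFSETS_TOPIC_NUM_PARTITIONS=1 \
  -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT_HOST \
  -e UNCLEAN_LEADER_ELECTION_ENABLE=true \
  --pod zookeeper-kafka \
  confluentinc/cp-kafka:latest
```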
The problem was the default number of partitions for the internal offsets topic (it was 49 or 50, if I remember correctly). With a fresh cluster of only 2 brokers, some of those partitions can end up without an in-sync replica on the surviving broker after restarts.
Now, with only one partition and 2 replicas, everything works well: I can start/stop my brokers as many times as I want, and the other broker takes the lead and continues to handle my messages.
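To verify that the internal offsets topic really got 1 partition with 2 replicas, it can be described like any other topic:

```shell
# The group coordinator lives on the leader of a __consumer_offsets partition,
# so each partition of this topic needs a replica on a surviving broker
./kafka-topics --describe \
  --bootstrap-server 192.168.64.2:29092,192.168.64.3:29092 \
  --topic __consumer_offsets
```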
Hope that helps someone in the future.