
Cannot read data from Kafka partition on exact listener port

I have a Kafka cluster of 3 nodes and use kafkacat to read data from it. I configured two listeners, PLAINTEXT and VPN_PLAINTEXT:

listeners=PLAINTEXT://0.0.0.0:6667,VPN_PLAINTEXT://0.0.0.0:6669
advertised.listeners=PLAINTEXT://hadoop-kafka1-stg.local.company.cloud:6667,VPN_PLAINTEXT://hadoop-kafka1-stg-vip.local.company.cloud:6669
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL,VPN_PLAINTEXT:PLAINTEXT

We found that we cannot consume data from node 1 (and only node 1): reading any partition whose leader is node 1 fails with this error:

kafkacat -C -b hadoop-kafka1-stg-vip.local.company.cloud:6669 -t <topic-name> -o beginning -e -q -p 11
% ERROR: Topic <topic-name> [11] error: Broker: Not leader for partition

I can see that node 1 is the leader for this partition:

Metadata for <topic-name> (from broker 3: hadoop-kafka3-stg-vip.local.company.cloud:6669/3):
 3 brokers:
  broker 2 at hadoop-kafka2-stg-vip.local.company.cloud:6669
  broker 3 at hadoop-kafka3-stg-vip.local.company.cloud:6669 (controller)
  broker 1 at hadoop-kafka1-stg-vip.local.company.cloud:6669
 1 topics:
  topic "<topic-name>" with 12 partitions:
    partition 0, leader 2, replicas: 2,1,3, isrs: 3,2,1
    partition 1, leader 3, replicas: 3,2,1, isrs: 3,2,1
    partition 2, leader 1, replicas: 1,3,2, isrs: 3,2,1
    partition 3, leader 2, replicas: 2,3,1, isrs: 3,2,1
    partition 4, leader 3, replicas: 3,1,2, isrs: 3,2,1
    partition 5, leader 1, replicas: 1,2,3, isrs: 3,2,1
    partition 6, leader 2, replicas: 2,1,3, isrs: 3,2,1
    partition 7, leader 3, replicas: 3,2,1, isrs: 3,2,1
    partition 8, leader 1, replicas: 1,3,2, isrs: 3,2,1
    partition 9, leader 2, replicas: 2,3,1, isrs: 3,2,1
    partition 10, leader 3, replicas: 3,1,2, isrs: 3,2,1
    partition 11, leader 1, replicas: 1,2,3, isrs: 3,2,1
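To see at a glance which partitions are affected, the metadata listing above can be grouped by leader. This is only a minimal sketch: the function name `partitions_by_leader` and the shortened `SAMPLE` text are illustrative assumptions, and the regular expression assumes the line layout shown in the listing above.

```python
import re
from collections import defaultdict

# Matches lines such as "    partition 11, leader 1, replicas: 1,2,3, isrs: 3,2,1".
# The layout is assumed from the metadata listing above.
PARTITION_RE = re.compile(r"partition (\d+), leader (-?\d+),")


def partitions_by_leader(metadata_text):
    """Group partition ids by leader broker id (hypothetical helper)."""
    leaders = defaultdict(list)
    for line in metadata_text.splitlines():
        match = PARTITION_RE.search(line)
        if match:
            partition = int(match.group(1))
            leader = int(match.group(2))
            leaders[leader].append(partition)
    return dict(leaders)


# A shortened excerpt of the listing above, used only as sample input.
SAMPLE = """\
    partition 0, leader 2, replicas: 2,1,3, isrs: 3,2,1
    partition 1, leader 3, replicas: 3,2,1, isrs: 3,2,1
    partition 2, leader 1, replicas: 1,3,2, isrs: 3,2,1
    partition 11, leader 1, replicas: 1,2,3, isrs: 3,2,1
"""

if __name__ == "__main__":
    for leader, partitions in sorted(partitions_by_leader(SAMPLE).items()):
        print(f"leader {leader}: partitions {partitions}")
```

Running it on the full listing would show that the partitions led by broker 1 are exactly the ones that cannot be read through port 6669.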

I thought the data on the node might be corrupted, so I removed everything from the Kafka data directory (kafka_data_dir). When I started the daemon again, I could see it syncing. After the sync finished, the issue persisted. There is nothing suspicious in the logs.

Could anybody help me find the root cause? Only node 1 has this issue. When I query the same node on port 6667, it works smoothly.

After a deeper investigation of the traffic with tcpdump, I found that the Kafka configuration itself was fine. When I asked node 1 for a topic partition, tcpdump on node 1 did not capture any packets; the requests were being sent to node 3. Requests should be forwarded over Citrix to the right Kafka node based on DNS, but that configuration was wrong:

  • hadoop-kafka1-stg-vip.local.company.cloud -> node 3
  • hadoop-kafka2-stg-vip.local.company.cloud -> node 2
  • hadoop-kafka3-stg-vip.local.company.cloud -> node 3
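A check like the following could flag such a misrouting once the actual targets are known. It is a hedged sketch: `EXPECTED` is inferred from the broker list in the metadata output, `OBSERVED` copies the mapping listed above, and `misrouted` is a hypothetical helper name. It does not query DNS or Citrix itself.

```python
# Which node each VIP hostname should reach (inferred from the broker
# list in the metadata output; treat it as an assumption).
EXPECTED = {
    "hadoop-kafka1-stg-vip.local.company.cloud": 1,
    "hadoop-kafka2-stg-vip.local.company.cloud": 2,
    "hadoop-kafka3-stg-vip.local.company.cloud": 3,
}

# Which node each VIP hostname actually reached, as listed above.
OBSERVED = {
    "hadoop-kafka1-stg-vip.local.company.cloud": 3,
    "hadoop-kafka2-stg-vip.local.company.cloud": 2,
    "hadoop-kafka3-stg-vip.local.company.cloud": 3,
}


def misrouted(expected, observed):
    """Return {hostname: (expected_node, observed_node)} for every mismatch."""
    return {
        host: (node, observed.get(host))
        for host, node in expected.items()
        if observed.get(host) != node
    }


if __name__ == "__main__":
    for host, (want, got) in misrouted(EXPECTED, OBSERVED).items():
        print(f"{host}: expected node {want}, reached node {got}")
```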

That is why requests for partitions where node 1 is not the leader worked, while requests for partitions where node 1 is the leader failed with the message Broker: Not leader for partition: Citrix always forwarded them to node 3.
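The effect can be reproduced with a small model. This is a simplified sketch, not a description of Kafka internals: it only assumes that a client sends a fetch to the leader's advertised address and that whichever broker actually receives it rejects the request if it is not the leader. `LEADERS` is copied from the metadata output, `ROUTING` from the Citrix mapping, and `failing_partitions` is a hypothetical helper name.

```python
# partition -> leader broker id, copied from the metadata output.
LEADERS = {0: 2, 1: 3, 2: 1, 3: 2, 4: 3, 5: 1,
           6: 2, 7: 3, 8: 1, 9: 2, 10: 3, 11: 1}

# broker whose VIP address is contacted -> broker that really receives
# the traffic (the faulty forwarding described above).
ROUTING = {1: 3, 2: 2, 3: 3}


def failing_partitions(leaders, routing):
    """Return the partitions whose fetch lands on a broker that is not the leader."""
    failing = []
    for partition, leader in sorted(leaders.items()):
        reached = routing[leader]  # broker that actually answers
        if reached != leader:
            failing.append(partition)
    return failing


if __name__ == "__main__":
    print(failing_partitions(LEADERS, ROUTING))
```

Under this model only the partitions led by node 1 fail, which matches the observed behaviour; with a correct mapping (`{1: 1, 2: 2, 3: 3}`) the list would be empty.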
