
Kafka on Kubernetes - UNKNOWN_TOPIC_OR_PARTITION and LEADER_NOT_AVAILABLE error

This is a follow-up to an earlier question. I have managed to do the following:

  1. Create a headless service for my 5-broker Kafka cluster for inter-broker communication
  2. Set up one service for each broker
    1. each service has an external IP
    2. only one pod is selected by each service, e.g. service "kafka-0-es" selects the pod "kafka-0"
  3. The pods advertise their respective external IPs correctly. I verified this by accessing the data on the ZooKeeper CLI.

I created a topic test-topic with zkCli and verified that it had been created. After that, I started the Kafka console producer:

.\kafka-console-producer.bat --broker-list EXTERNAL_IP_1:9093,EXTERNAL_IP_2:9093,EXTERNAL_IP_3:9093,EXTERNAL_IP_4:9093,EXTERNAL_IP_5:9093 --topic test-topic --property parse.key=true --property key.separator=:
>afkjdshasdkfjhsdkjsf:128379127893123
>[2018-05-09 17:35:51,622] WARN [Producer clientId=console-producer] Got error produce response with correlation id 9 on topic-partition test-topic-0, retrying (2 attempts left). Error: UNKNOWN_TOPIC_OR_PARTITION (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,623] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,649] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 10 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:51,720] WARN [Producer clientId=console-producer] Got error produce response with correlation id 11 on topic-partition test-topic-0, retrying (1 attempts left). Error: UNKNOWN_TOPIC_OR_PARTITION (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,720] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,773] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 12 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:51,823] WARN [Producer clientId=console-producer] Got error produce response with correlation id 13 on topic-partition test-topic-0, retrying (0 attempts left). Error: UNKNOWN_TOPIC_OR_PARTITION (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,823] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:51,913] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 14 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:51,936] ERROR Error when sending message to topic test-topic with key: 20 bytes, value: 15 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
[2018-05-09 17:35:51,945] WARN [Producer clientId=console-producer] Received unknown topic or partition error in produce request on partition test-topic-0. The topic/partition may not exist or the user may not have Describe access to it (org.apache.kafka.clients.producer.internals.Sender)
[2018-05-09 17:35:52,034] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 16 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:35:52,161] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 20 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-09 17:40:52,288] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 25 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)

According to ZooKeeper, my Kafka broker "kafka-2" is the leader for partition 0 of this topic:

get /kafka/brokers/topics/test-topic/partitions/0/state

{"controller_epoch":5,"leader":2,"version":1,"leader_epoch":0,"isr":[2,1]} 
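
For reference, the state znode above can be read mechanically: `leader` is the broker id currently leading the partition and `isr` is the list of in-sync replicas. The following is a minimal Python sketch, not part of the original setup, that parses the JSON shown above; the function name `describe_partition_state` is hypothetical.

```python
import json

# Partition state exactly as returned by ZooKeeper in the question.
STATE_JSON = '{"controller_epoch":5,"leader":2,"version":1,"leader_epoch":0,"isr":[2,1]}'


def describe_partition_state(raw: str) -> dict:
    """Summarize a Kafka partition state znode (hypothetical helper)."""
    state = json.loads(raw)
    leader = state["leader"]
    isr = state["isr"]
    return {
        "leader": leader,
        "isr": isr,
        # Kafka records -1 as the leader when no leader is elected.
        "has_leader": leader != -1,
        "leader_in_isr": leader in isr,
    }


if __name__ == "__main__":
    print(describe_partition_state(STATE_JSON))
```

For the state above, this reports broker 2 as leader and as a member of the in-sync replica set, so from ZooKeeper's point of view the partition does have a leader even though clients receive LEADER_NOT_AVAILABLE.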

But the pod kafka-2 is logging errors:

[2018-05-09 15:21:02,524] ERROR [ReplicaFetcherThread-0-2], Error for partition [test-topic,0] to broker 2:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. (kafka.server.ReplicaFetcherThread)

I am not quite sure why this is happening; the configuration looks fine to me. Is there something more I am missing to get my Kafka cluster running on Kubernetes?

Note that I have also tried to completely wipe my cluster (scale down the Kafka cluster, delete the Kafka storage, scale down the ZooKeeper cluster, delete the ZooKeeper storage, scale up ZooKeeper, scale up Kafka), but to no avail.

I have fixed it just now. The problem was that my headless service contained both the internal and the external port.

Now my headless service contains only the internal port:

apiVersion: v1
kind: Service
metadata:
  name: kafka-hs
  labels:
    app: kafka
spec:
  ports:
  - port: 29092
    name: server
  clusterIP: None
  selector:
    app: kafka

And my per-pod services that expose the external IPs contain the external port (note that a Red Hat OpenShift script handles the allocation of external IPs to these services; this is not covered in the service definition):

apiVersion: v1
kind: Service
metadata:
  name: kafka-es-4
  labels:
    app: kafka
  namespace: whatever
spec:
  ports:
  - port: 9093
    name: kafka-port
    protocol: TCP
  selector:
    statefulset.kubernetes.io/pod-name: kafka-4
    app: kafka
  type: LoadBalancer
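
Since the same definition is needed once per broker, the five per-pod services could be generated rather than written by hand. The following is a minimal Python sketch under a few assumptions: the names follow the `kafka-es-N` / `kafka-N` pattern shown above, broker indices run from 0 to 4, and the helper names `per_pod_service` and `all_services` are hypothetical. It emits JSON, which `kubectl apply -f` accepts in place of YAML.

```python
import json


def per_pod_service(index: int, namespace: str = "whatever", port: int = 9093) -> dict:
    """Build one external Service that selects exactly one broker pod."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": f"kafka-es-{index}",
            "labels": {"app": "kafka"},
            "namespace": namespace,
        },
        "spec": {
            "type": "LoadBalancer",
            # Only the external port; the internal port stays on the headless service.
            "ports": [{"port": port, "name": "kafka-port", "protocol": "TCP"}],
            "selector": {
                "statefulset.kubernetes.io/pod-name": f"kafka-{index}",
                "app": "kafka",
            },
        },
    }


def all_services(broker_count: int = 5) -> dict:
    """Wrap the per-pod services in a v1 List so they can be applied together."""
    items = [per_pod_service(i) for i in range(broker_count)]
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    print(json.dumps(all_services(), indent=2))
```

The output can be redirected to a file and applied in one step; the allocation of external IPs is still left to whatever mechanism the cluster provides.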
