
Can Kafka partitions be spread across multiple Kafka cluster nodes?

My application has a list of Kafka cluster nodes specified in the spring.kafka.bootstrap-servers property and listens to topics on all these nodes.

If I were to create a topic on one of these nodes with, let's say, 5 partitions, will these partitions be spread across the multiple nodes or will they all be created on a single node? Also, how can I find out which node a topic partition actually exists on?

You don't actually create topics on one specific node in a Kafka cluster. When you issue a request to create a topic, the partitions are automatically spread out across the nodes (brokers) belonging to the cluster, and the replicas are spread out as well. That is how Kafka handles high availability. If one of the nodes goes down and the topic's replication factor is greater than 1, another node holds a replica of the affected partitions and can take over as leader, so users of the cluster see little or no impact.
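To make the idea of spreading partitions and replicas concrete, here is a minimal Python sketch of a round-robin placement. It is a simplified illustration and not Kafka's actual assignment code (the real logic also randomizes the starting broker and can take racks into account); the function names `assign_replicas` and `partitions_per_broker` are made up for this example.

```python
from collections import defaultdict


def assign_replicas(num_partitions, broker_ids, replication_factor, start_index=0):
    """Simplified round-robin placement (illustrative only, not Kafka's code).

    Returns {partition: [broker, ...]}; the first broker in each list plays
    the role of the preferred leader, the rest are followers.
    """
    n = len(broker_ids)
    if replication_factor > n:
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        first = (start_index + partition) % n
        # Consecutive brokers after the leader hold the follower replicas.
        assignment[partition] = [
            broker_ids[(first + r) % n] for r in range(replication_factor)
        ]
    return assignment


def partitions_per_broker(assignment):
    """Invert the assignment: which partitions does each broker host?"""
    hosted = defaultdict(list)
    for partition, replicas in assignment.items():
        for broker in replicas:
            hosted[broker].append(partition)
    return dict(hosted)


if __name__ == "__main__":
    # 5 partitions, 3 brokers, 2 copies of each partition (assumed values).
    layout = assign_replicas(5, [0, 1, 2], 2)
    for partition, replicas in layout.items():
        print(f"partition {partition}: leader={replicas[0]} replicas={replicas}")
    print(partitions_per_broker(layout))
```

With 5 partitions and 3 brokers, every broker ends up hosting some partitions, and every partition has a copy on a second broker that could take over if the first one fails.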

You can issue a --describe command like this:

> bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic my-replicated-topic

    Topic:my-replicated-topic   PartitionCount:1    ReplicationFactor:3 Configs:
        Topic: my-replicated-topic  Partition: 0    Leader: 1   Replicas: 1,2,0 Isr: 1,2,0

That will give you a list of the partitions for your topic, where they are located (Replicas), which node is the leader for each partition (the one consumers are told to consume from when they need data from that partition), and some more info such as the in-sync replica set (ISR) and the replication factor.

There's more info in the official Kafka docs.

Bear in mind that when your client connects to the bootstrap-server, it is not specifying a complete list of brokers from which to read data. It's just specifying one (or more) brokers from which to pull information about the cluster. When the client reads from or writes to a given topic and partition, it does so directly against the broker that holds that data (regardless of the particular brokers specified in the bootstrap list).
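The bootstrap behaviour described above can be pictured with a toy model. The Python sketch below is not a real Kafka client and uses no Kafka library; `ToyCluster`, `ToyClient`, the host names, and the topic name `orders` are all invented for illustration. It only shows the two steps: ask any reachable bootstrap address for cluster metadata, then route each partition to its leader.

```python
class ToyCluster:
    """Stand-in for a cluster: knows every broker and every partition leader."""

    def __init__(self, brokers, leaders):
        self.brokers = brokers  # broker id -> "host:port"
        self.leaders = leaders  # (topic, partition) -> leader broker id

    def metadata(self, contacted_address):
        # Any live broker can describe the whole cluster.
        if contacted_address not in self.brokers.values():
            raise ConnectionError(f"cannot reach {contacted_address}")
        return {"brokers": dict(self.brokers), "leaders": dict(self.leaders)}


class ToyClient:
    """Uses bootstrap addresses only to discover the cluster layout."""

    def __init__(self, cluster, bootstrap_servers):
        last_error = None
        for address in bootstrap_servers:
            try:
                self.metadata = cluster.metadata(address)
                break  # one successful metadata response is enough
            except ConnectionError as error:
                last_error = error
        else:
            raise ConnectionError("no bootstrap server reachable") from last_error

    def broker_for(self, topic, partition):
        # Reads and writes go to the partition leader, which may be a broker
        # that was never listed in bootstrap_servers.
        leader_id = self.metadata["leaders"][(topic, partition)]
        return self.metadata["brokers"][leader_id]


if __name__ == "__main__":
    cluster = ToyCluster(
        brokers={0: "kafka-0:9092", 1: "kafka-1:9092", 2: "kafka-2:9092"},
        leaders={("orders", 0): 0, ("orders", 1): 1, ("orders", 2): 2},
    )
    client = ToyClient(cluster, bootstrap_servers=["kafka-0:9092"])
    for partition in range(3):
        print(partition, client.broker_for("orders", partition))
```

Listing more than one bootstrap address only matters for the first step: it gives the client a fallback if the first address is unreachable at startup.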

As the other answer said, a topic is not owned by or created for a particular node; it is created for the cluster as a whole. Whenever a topic is created, its partitions are divided among the cluster nodes. Each partition has a leader node and replica nodes. Producers write to the leader node, and Kafka internally replicates the data to the replica nodes. By default, consumers consume a partition's data from its leader node.

For a better understanding/visualisation of topic partition distribution in Kafka, you can use tools like Kafdrop. You can follow the steps in the README section of the repo for setup and download the latest binary. In the UI, you can see the leader and replica nodes for each partition of a topic.

The setup is pretty straightforward and I personally find the tool VERY useful!
