
Kafka Topic, Broker, ZooKeeper architecture overview

I have read a bunch of articles about Kafka architecture, but I'm still brand-new to this, and when it came to coding I wasn't sure whether I had understood things correctly.

From what I understand, Kafka server, broker and node are synonyms. There can be several brokers within a Kafka cluster. There is a Kafka topic (T1) and it consists of a few partitions (P1, P2, ...). These partitions can be replicated across the brokers (B1, B2, ...). B1 can be the leader for P1, B2 for P2, and so on. Do we say that topic T1 is defined for a broker or for the cluster, and if we treat a topic as a set of partitions, can we say 'topic replicas'?

From the official Kafka documentation:

bootstrap.servers: A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping; this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).

So from what I understand, defining host1:port1,host2:port2 says that there are two brokers.

In this case, does ZooKeeper automatically distribute a message to a leader when executing bin/kafka-console-producer.sh --broker-list host1:port1,host2:port2 --topic test? (I believe I read somewhere that a producer should read the broker id from ZooKeeper, but wouldn't that be unnecessary here?) Is it equal to publishing using bin/kafka-console-producer.sh --zookeeper host1:z_port1,host2:z_port2 --topic test? How should I understand bin/kafka-configs.sh --zookeeper host1:z_port1,host2:z_port2? Do we have only one ZooKeeper instance?

Do we say that there is topic T1 defined for broker or cluster, and if we treat topic as set of partitions can we say 'topic replicas'?

1) Cluster.
2) Partitions are replicated individually, each on as many brokers as the replication factor; the cluster itself often has more brokers than the replication factor. So it is partitions, not topics, that have replicas, and the more proper term for the replicas that are caught up with the leader is "in-sync replicas (ISR)".
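To make the topic / partition / replica relationship concrete, here is a minimal sketch in plain Python. It is a toy model, not Kafka code: the Partition class, the assign_replicas helper, the round-robin placement and the broker ids are all assumptions made for illustration, and Kafka's real replica assignment and leader election are more involved.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Partition:
    topic: str
    index: int
    replicas: List[int]                          # ids of the brokers that hold a copy
    isr: Set[int] = field(default_factory=set)   # replicas currently caught up

    @property
    def leader(self) -> int:
        # Simplification: the first replica that is still in sync leads.
        return next(b for b in self.replicas if b in self.isr)


def assign_replicas(topic, num_partitions, replication_factor, broker_ids):
    """Toy round-robin placement (hypothetical helper, not Kafka's algorithm)."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    partitions = []
    for p in range(num_partitions):
        replicas = [broker_ids[(p + r) % len(broker_ids)]
                    for r in range(replication_factor)]
        partitions.append(Partition(topic, p, replicas, set(replicas)))
    return partitions


brokers = [1, 2, 3, 4]   # the cluster has more brokers than the replication factor
t1 = assign_replicas("T1", num_partitions=3, replication_factor=2, broker_ids=brokers)
for part in t1:
    print(f"{part.topic}-{part.index}: replicas={part.replicas} leader={part.leader}")

# Broker 1 falls behind on partition 0 and drops out of that partition's ISR,
# so leadership of that one partition moves to the remaining in-sync replica.
t1[0].isr.discard(1)
print("T1-0 leader after broker 1 leaves the ISR:", t1[0].leader)   # 2
```

The topic belongs to the cluster as a whole, while replicas, leaders and ISRs are tracked per partition, which is why "partition replicas" is the more accurate phrase.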

does ZooKeeper automatically distribute a message to a leader when executing

ZooKeeper does not, no. Your client contacts one of the brokers listed in bootstrap.servers and receives metadata that lists all brokers in the cluster and which broker is the leader for each topic-partition. The client then connects to each leader broker individually and produces to it for the partitions it has calculated.
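The flow described above (bootstrap, fetch metadata, pick a partition, send to that partition's leader) can be sketched roughly as follows. This is a stdlib-only simulation and not a real Kafka client: the host names, port 9092, the CLUSTER_METADATA dictionary and every function name are hypothetical, and the CRC32 hash only stands in for Kafka's own partitioner.

```python
import zlib


def parse_bootstrap_servers(value):
    """Split 'host1:9092,host2:9092' into (host, port) pairs."""
    pairs = []
    for item in value.split(","):
        host, _, port = item.strip().rpartition(":")
        pairs.append((host, int(port)))
    return pairs


# Hypothetical cluster state, as a broker might report it in a metadata response.
CLUSTER_METADATA = {
    "brokers": {1: ("host1", 9092), 2: ("host2", 9092), 3: ("host3", 9092)},
    "leaders": {("test", 0): 1, ("test", 1): 2, ("test", 2): 3},
}


def fetch_metadata(bootstrap):
    """Stand-in for the metadata request sent to a reachable bootstrap broker."""
    known = set(CLUSTER_METADATA["brokers"].values())
    if not any(pair in known for pair in bootstrap):
        raise ConnectionError("no bootstrap broker reachable")
    return CLUSTER_METADATA


def choose_partition(key, num_partitions):
    # Illustrative hash only; a real client uses its configured partitioner.
    return zlib.crc32(key) % num_partitions


def route(topic, key, metadata):
    """Return (partition, leader address) for a record with the given key."""
    num_partitions = sum(1 for (t, _) in metadata["leaders"] if t == topic)
    partition = choose_partition(key, num_partitions)
    leader_id = metadata["leaders"][(topic, partition)]
    return partition, metadata["brokers"][leader_id]


# Only two of the three brokers are listed, yet all three are discovered.
bootstrap = parse_bootstrap_servers("host1:9092,host2:9092")
metadata = fetch_metadata(bootstrap)
print("brokers discovered:", sorted(metadata["brokers"]))

for key in (b"user-1", b"user-2", b"user-3"):
    partition, leader = route("test", key, metadata)
    print(key, "-> partition", partition, "-> leader", leader)
```

Note that ZooKeeper never appears in this path, and that listing two hosts in bootstrap.servers says nothing about how many brokers the cluster has.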

Is it equal to publishing

Producing*, yes.

We have only one zookeeper instance?

One ZooKeeper cluster can manage multiple Kafka clusters via a feature called a chroot: the root directory in the ZooKeeper znode tree that contains the information about the managed service.
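As a rough illustration of the chroot idea, the sketch below splits a ZooKeeper connection string of the form host1:port1,host2:port2/chroot into its host list and chroot path. The comma-separated hosts are members of the same ZooKeeper ensemble; the optional path suffix selects the subtree one Kafka cluster uses. The host names, the /kafka-a and /kafka-b paths and both helper functions are made up for this example.

```python
def parse_zookeeper_connect(value):
    """Split 'zk1:2181,zk2:2181/kafka-a' into (hosts, chroot)."""
    hosts_part, slash, path = value.partition("/")
    hosts = [h.strip() for h in hosts_part.split(",") if h.strip()]
    # Without a path suffix the cluster lives at the ZooKeeper root.
    chroot = "/" + path.strip("/") if slash and path.strip("/") else "/"
    return hosts, chroot


def znode_path(chroot, relative):
    """Resolve a znode such as 'brokers/ids' under a cluster's chroot."""
    return chroot.rstrip("/") + "/" + relative.strip("/")


# Two hypothetical Kafka clusters sharing one three-node ZooKeeper ensemble
cluster_a = parse_zookeeper_connect("zk1:2181,zk2:2181,zk3:2181/kafka-a")
cluster_b = parse_zookeeper_connect("zk1:2181,zk2:2181,zk3:2181/kafka-b")

print(cluster_a)                                 # (['zk1:2181', 'zk2:2181', 'zk3:2181'], '/kafka-a')
print(znode_path(cluster_a[1], "brokers/ids"))   # /kafka-a/brokers/ids
print(znode_path(cluster_b[1], "brokers/ids"))   # /kafka-b/brokers/ids
print(znode_path("/", "brokers/ids"))            # /brokers/ids (no chroot)
```

Both clusters talk to the same ensemble but keep their znodes in separate subtrees, so they do not interfere with each other.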

Also, the kafka-topics command can now use --bootstrap-server instead of --zookeeper.


 