Where does Kafka store a topic in a multi-node cluster?
I have a 3-node Kafka cluster and I am creating a topic on one of the nodes with the command below:
bin/kafka-create-topic.sh --zookeeper host1.com:2181,host2.com:2181,host3.com:2181 --replica 1 --partition 1 --topic test
Now, when I push messages to the topic, one of my hosts gets overloaded with the topic's messages, because Kafka stores messages on disk. I want to know whether there is any configuration I can set to distribute the storage across the cluster.
Thanks,
As @om-nom-nom points out, you are creating a topic with a single partition. With one partition and a replication factor of 1, all of that topic's data lives on the one broker that hosts the partition. So even though you have a 3-node setup, the other two nodes will never be used for this topic.
Changing your topic to use multiple partitions is how you distribute a Kafka topic. The Kafka broker doesn't distribute messages to different nodes. It's the producer's responsibility to determine which partition a message goes to. You can determine this yourself, or let the producer use a round-robin approach to distribute messages across partitions, as @om-nom-nom points out.
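To make the round-robin idea concrete, here is a minimal sketch in plain Python. It does not use any Kafka client library; `RoundRobinPartitioner` and `distribute` are hypothetical names made up for illustration, and the dictionary of lists merely stands in for the partitions of a topic.

```python
from itertools import count


class RoundRobinPartitioner:
    """Hypothetical helper: cycles through partition ids 0..n-1."""

    def __init__(self, num_partitions):
        if num_partitions < 1:
            raise ValueError("a topic needs at least one partition")
        self.num_partitions = num_partitions
        self._counter = count()  # 0, 1, 2, ...

    def partition(self):
        # Each call returns the next partition id, wrapping around.
        return next(self._counter) % self.num_partitions


def distribute(messages, num_partitions):
    """Simulate a producer spreading messages over partitions."""
    partitioner = RoundRobinPartitioner(num_partitions)
    partitions = {p: [] for p in range(num_partitions)}
    for message in messages:
        partitions[partitioner.partition()].append(message)
    return partitions


if __name__ == "__main__":
    # With one partition, everything lands in the same place.
    print(distribute(["a", "b", "c", "d", "e", "f"], 1))
    # With three partitions, the load is spread evenly.
    print(distribute(["a", "b", "c", "d", "e", "f"], 3))
```

With a single partition every message ends up in partition 0, which matches the overload described in the question; with three partitions the same six messages are split two per partition.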
In the Kafka producer, a partition key can be specified to indicate the destination partition of a message. By default, a hashing-based partitioner determines the partition id from the key, and you can also plug in a customized partitioner.
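The key-to-partition mapping can be sketched roughly as follows. This is an illustration only: it uses `zlib.crc32` as a stand-in hash, which is an assumption and not necessarily the hash Kafka's default partitioner uses, and the function names `hash_partition`, `region_partition`, and `send` are hypothetical.

```python
import zlib


def hash_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition id by hashing (stand-in hash: CRC32)."""
    return zlib.crc32(key) % num_partitions


def region_partition(key: bytes, num_partitions: int) -> int:
    """Example of a customized partitioner: keys prefixed with b"eu-"
    are pinned to partition 0, everything else is hashed."""
    if key.startswith(b"eu-"):
        return 0
    return hash_partition(key, num_partitions)


def send(key: bytes, num_partitions: int, partitioner=hash_partition) -> int:
    """Return the partition a message with this key would go to."""
    return partitioner(key, num_partitions)


if __name__ == "__main__":
    for key in (b"user-1", b"user-2", b"eu-user-3"):
        print(key, send(key, 3), send(key, 3, partitioner=region_partition))
```

The property that matters is that the same key always maps to the same partition, so messages sharing a key stay together, while different keys are spread across the available partitions.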
To reduce the number of open sockets, in 0.8.0 (https://issues.apache.org/jira/browse/KAFKA-1017), when the partitioning key is not specified or is null, a producer picks a random partition and sticks to it for some time (10 minutes by default) before switching to another one.
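The "pick a random partition and stick to it for a while" behavior described above can be approximated with the sketch below. It is not Kafka's implementation: `StickyRandomPartitioner`, its `stick_seconds` parameter, and the injectable `clock` are hypothetical names chosen for illustration, with 600 seconds standing in for the 10-minute default mentioned in the text.

```python
import random
import time


class StickyRandomPartitioner:
    """Hypothetical sketch: choose a random partition and reuse it
    until `stick_seconds` have elapsed, then choose again."""

    def __init__(self, num_partitions, stick_seconds=600.0,
                 clock=time.monotonic, rng=None):
        self.num_partitions = num_partitions
        self.stick_seconds = stick_seconds
        self._clock = clock              # injectable so tests can fake time
        self._rng = rng or random.Random()
        self._current = None             # partition currently in use
        self._chosen_at = None           # time the current one was picked

    def partition(self):
        now = self._clock()
        expired = (self._current is None
                   or now - self._chosen_at >= self.stick_seconds)
        if expired:
            # The new random pick may happen to equal the previous one.
            self._current = self._rng.randrange(self.num_partitions)
            self._chosen_at = now
        return self._current


if __name__ == "__main__":
    partitioner = StickyRandomPartitioner(num_partitions=3, stick_seconds=1.0)
    print([partitioner.partition() for _ in range(5)])  # same id repeated
    time.sleep(1.1)
    print(partitioner.partition())  # re-picked after the window expired
```

One consequence worth noting: with keyless messages and a single producer, one partition receives all traffic during each window, so the load only evens out over time or across many producers.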
source
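A topic can be split into multiple partitions (your configuration uses only 1), which by default are distributed across the brokers in a round-robin fashion.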
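As a rough illustration of how partitions end up on different brokers, the sketch below assigns partition ids to brokers in round-robin order. It is a simplification under stated assumptions: `assign_partitions` is a hypothetical function, the starting broker is fixed at index 0, and replica placement is ignored, whereas a real cluster also has to place replicas and may start from a different broker.

```python
def assign_partitions(num_partitions, brokers, start=0):
    """Assign partition ids to brokers in round-robin order.

    Simplified sketch: one replica per partition, fixed start index.
    """
    if not brokers:
        raise ValueError("need at least one broker")
    return {
        partition: brokers[(start + partition) % len(brokers)]
        for partition in range(num_partitions)
    }


if __name__ == "__main__":
    brokers = ["host1.com", "host2.com", "host3.com"]
    # One partition: a single broker stores the whole topic.
    print(assign_partitions(1, brokers))
    # Three partitions: each broker stores one partition.
    print(assign_partitions(3, brokers))
```

Under these assumptions, the single-partition topic from the question occupies exactly one of the three hosts, while a three-partition topic would place one partition on each host.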