
Where does Kafka store a topic in a multi-node cluster?

I have a 3-node Kafka cluster, and I am creating a topic on one of the nodes with the command below: bin/kafka-create-topic.sh --zookeeper host1.com:2181,host2.com:2181,host3.com:2181 --replica 1 --partition 1 --topic test

Now, when I push messages to the topic, one of my hosts gets overloaded with the topic's messages, because Kafka stores messages on disk. Is there any configuration I can set to distribute the storage across the cluster?

Thanks,

As @om-nom-nom points out, you are creating a topic with a single partition and a single replica. All of that topic's data therefore lives on exactly one broker, the one that holds that partition. So even though you have a 3-node setup, the other two nodes will never be used for this topic.

Changing your topic to use multiple partitions is how you distribute a Kafka topic. The Kafka broker doesn't redistribute messages across nodes; it is the producer's responsibility to determine which partition a message goes to. You can determine this yourself, or let the producer use a round-robin approach to distribute messages across partitions, as @om-nom-nom points out.

In the Kafka producer, a partition key can be specified to indicate the destination partition of a message. By default, a hashing-based partitioner determines the partition id from the key, and you can also plug in a custom partitioner.
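
The idea of a hashing-based partitioner can be sketched in a few lines of Python. This is only an illustration, not Kafka's implementation: the function name `choose_partition` is hypothetical, and `zlib.crc32` is a stand-in for whatever hash function the real producer uses.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition id so that equal keys share a partition."""
    # Assumption: crc32 is a stand-in hash for illustration only;
    # Kafka's own partitioners use different hash functions.
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    # With 3 partitions, keys spread over ids 0..2; the repeated key
    # "user-1" maps to the same partition both times.
    for key in [b"user-1", b"user-2", b"user-3", b"user-1"]:
        print(key.decode(), "->", choose_partition(key, 3))
```

With `num_partitions` set to 1, as in the topic from the question, every key maps to partition 0, which is why a single broker ends up with all the data.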

To reduce the number of open sockets, in 0.8.0 ( https://issues.apache.org/jira/browse/KAFKA-1017 ), when the partitioning key is not specified or is null, a producer picks a random partition and sticks to it for some time (10 minutes by default) before switching to another one.
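
The "pick a random partition and stick to it" behaviour for messages without a key might look roughly like the sketch below. The class `StickyRandomPartitioner` and its parameters are hypothetical names chosen for illustration; only the 10-minute default comes from the text above.

```python
import random
import time


class StickyRandomPartitioner:
    """Pick a random partition and reuse it until the interval expires."""

    def __init__(self, num_partitions, refresh_interval_s=600.0,
                 clock=time.monotonic):
        self.num_partitions = num_partitions
        # 600 s mirrors the 10-minute default mentioned above.
        self.refresh_interval_s = refresh_interval_s
        # The clock is injectable so the behaviour can be tested quickly.
        self.clock = clock
        self._current = None
        self._chosen_at = None

    def partition(self):
        now = self.clock()
        expired = (self._current is None
                   or now - self._chosen_at >= self.refresh_interval_s)
        if expired:
            # A new random pick; it may happen to be the same partition.
            self._current = random.randrange(self.num_partitions)
            self._chosen_at = now
        return self._current
```

A consequence of this design is that, over a short window, all keyless messages from one producer land on a single partition, so load only evens out over time or across many producers.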

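A topic can be split into multiple partitions (your configuration uses only 1), which by default are distributed across the brokers in round-robin fashion.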
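
As a rough illustration of round-robin placement, the sketch below assigns partition ids to brokers in turn. It is a simplification under stated assumptions: the function `assign_partitions` is hypothetical, replicas are ignored, and the assignment always starts at the first broker in the list, which Kafka itself does not necessarily do. The host names are the ones from the question.

```python
from collections import defaultdict


def assign_partitions(num_partitions, brokers):
    """Assign partition ids to brokers in round-robin order (no replicas)."""
    assignment = defaultdict(list)
    for partition_id in range(num_partitions):
        # Simplification: always start from brokers[0].
        broker = brokers[partition_id % len(brokers)]
        assignment[broker].append(partition_id)
    return dict(assignment)


if __name__ == "__main__":
    brokers = ["host1.com", "host2.com", "host3.com"]
    # One partition: only one broker stores data for the topic.
    print(assign_partitions(1, brokers))
    # Three partitions: each broker stores one partition.
    print(assign_partitions(3, brokers))
```

With a single partition only one broker appears in the result, which matches the overloaded host described in the question; with three or more partitions every broker holds a share of the topic.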
