Where does Kafka store a topic in a multi-node cluster?

I have a 3-node Kafka cluster, and I am creating a topic on one of the nodes with the command below: bin/kafka-create-topic.sh --zookeeper host1.com:2181,host2.com:2181,host3.com:2181 --replica 1 --partition 1 --topic test

So now, when I push messages to the topic, one of my hosts gets overloaded with the topic's messages, since Kafka stores messages on disk. I want to know whether there is any configuration I can set to distribute the storage across the cluster.

Thanks,

As @om-nom-nom points out, you are creating a topic with a single partition. That one partition lives on a single broker, and with a replication factor of 1 there are no copies of it elsewhere. So even though you have a 3-node setup, the other two nodes will never hold any data for this topic.

Changing your topic to use multiple partitions (for example, --partition 3 instead of --partition 1 in your create command) is how you distribute a Kafka topic across the cluster. The Kafka broker doesn't distribute messages to different nodes; it's the producer's responsibility to determine which partition a message goes to. You can determine this yourself, or let the producer use a round-robin approach to distribute messages across partitions, as @om-nom-nom points out.

In the Kafka producer, a partition key can be specified to indicate the destination partition of a message. By default, a hash-based partitioner determines the partition id from the key, and you can also plug in a custom partitioner.
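To make the hash-based idea concrete, here is a rough sketch in Python. It is not Kafka's code: the function name choose_partition is made up for illustration, and CRC32 is only a stand-in for whatever hash the real producer applies to the key. The point it shows is that the same key always maps to the same partition, and that with a single partition every message lands in partition 0.

```python
import zlib
from collections import Counter


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition id in [0, num_partitions).

    CRC32 is an illustrative stand-in; Kafka's producer uses its own hash.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    # Hypothetical keys, just to see how they spread over 3 partitions.
    keys = [f"user-{i}".encode() for i in range(300)]
    spread = Counter(choose_partition(k, 3) for k in keys)
    print("messages per partition with 3 partitions:", dict(spread))

    # With 1 partition (as in the question) everything goes to partition 0.
    single = Counter(choose_partition(k, 1) for k in keys)
    print("messages per partition with 1 partition:", dict(single))
```

With only one partition there is nothing for the partitioner to choose between, which is why the topic in the question ends up on one host.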

To reduce the number of open sockets, in 0.8.0 ( https://issues.apache.org/jira/browse/KAFKA-1017 ), when the partitioning key is not specified or is null, a producer picks a random partition and sticks to it for some time (10 minutes by default) before switching to another one.
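The "pick a random partition and stick to it" behaviour for null keys can be sketched like this. Again, this is only an illustration under my own assumptions, not the producer's actual implementation: the class name StickyRandomPartitioner, the refresh_seconds parameter, and the injectable clock are all hypothetical, and the 600-second default simply mirrors the 10 minutes mentioned above.

```python
import random
import time


class StickyRandomPartitioner:
    """Pick a random partition and reuse it until refresh_seconds have passed."""

    def __init__(self, num_partitions, refresh_seconds=600.0,
                 clock=time.monotonic, rng=None):
        if num_partitions < 1:
            raise ValueError("num_partitions must be at least 1")
        self.num_partitions = num_partitions
        self.refresh_seconds = refresh_seconds
        self._clock = clock                 # injectable so the sketch can be tested
        self._rng = rng or random.Random()
        self._current = None                # partition currently being "stuck to"
        self._chosen_at = None              # clock reading when it was picked

    def partition(self):
        """Return the partition for the next keyless message."""
        now = self._clock()
        expired = (self._current is None
                   or now - self._chosen_at >= self.refresh_seconds)
        if expired:
            self._current = self._rng.randrange(self.num_partitions)
            self._chosen_at = now
        return self._current


if __name__ == "__main__":
    partitioner = StickyRandomPartitioner(num_partitions=3)
    # All of these calls happen within the refresh window,
    # so every message goes to the same partition.
    print([partitioner.partition() for _ in range(5)])
```

One consequence worth keeping in mind: with keyless messages and only a few producers, a single partition can receive all the traffic for the whole interval, so the load only evens out over longer periods or across many producers.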
source

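A topic can be sliced into multiple partitions (your configuration uses only 1), which by default are distributed among the brokers in a round-robin fashion.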
