
Why are my Kafka consumers with the same group id not being balanced?

I'm writing a proof-of-concept application that consumes messages from Apache Kafka 0.9.0.0, to see whether I can use it in place of a common JMS message broker because of the benefits Kafka provides. This is my base code, using the new consumer API:

import static java.util.Arrays.asList;

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class Main implements Runnable {

    public static final long DEFAULT_POLL_TIME = 300;
    public static final String DEFAULT_GROUP_ID = "ltmjTest";

    volatile boolean keepRunning = true;
    private KafkaConsumer<String, Object> consumer;
    private String servers;
    private String groupId = DEFAULT_GROUP_ID;
    private long pollTime = DEFAULT_POLL_TIME;
    private String[] topics;

    public Main() {
    }

    //getters and setters...
    //run() (not shown) polls the consumer while keepRunning is true

    public void createConsumer() {
        Map<String, Object> configs = new HashMap<>();
        configs.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        // both test consumers end up in the same group by default
        configs.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);

        configs.put("enable.auto.commit", "true");
        configs.put("auto.commit.interval.ms", "1000");
        configs.put("session.timeout.ms", "30000");

        configs.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        configs.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumer = new KafkaConsumer<>(configs);
        consumer.subscribe(asList(topics));
    }

    public static void main(String[] args) {
        Main main = new Main();
        if (args != null && args.length > 0) {
            for (String arg : args) {
                String[] realArg = arg.trim().split("=", 2);
                String argKey = realArg[0].toLowerCase();
                String argValue = realArg[1];
                switch (argKey) {
                case "polltime":
                    main.setPollTime(Long.parseLong(argValue));
                    break;
                case "groupid":
                    main.setGroupId(argValue);
                    break;
                case "servers":
                    main.setServers(argValue);
                    break;
                case "topics":
                    main.setTopics(argValue.split(","));
                    break;
                }
            }
        }
        main.createConsumer();
        new Thread(main).start();
        try (Scanner scanner = new Scanner(System.in)) {
            while(true) {
                String line = scanner.nextLine();
                if (line.equals("stop")) {
                    main.setKeepRunning(false);
                    break;
                }
            }
        }
    }
}

I've started a Kafka server using default settings and a Kafka producer using the shell tool kafka-console-producer.sh to write messages to my topic. Then I connected two consumers using this code, passing the proper server to connect to and the topic to subscribe to, with everything else left at its default values, which means both consumers have the same group id. I noticed that only one of my consumers consumes all the data. I've read in the official tutorial that the default behaviour should be that the consumers are balanced by the server:

If all the consumer instances have the same consumer group, then this works just like a traditional queue balancing load over the consumers.

How can I get the consumers to behave like the default? Or am I missing something?

There is a trait, kafka.consumer.PartitionAssignor, that defines how partitions should be assigned to consumers. It has two implementations: RoundRobinAssignor and RangeAssignor. The default one is RangeAssignor.

This can be changed by setting the parameter "partition.assignment.strategy".
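For example, a minimal sketch of switching the consumer in the question to the round robin assignor, assuming the 0.9 new-consumer API where the property takes the assignor's fully qualified class name, would be one extra entry in the config map:

    // Replace the default RangeAssignor with the RoundRobinAssignor;
    // the value is the assignor's fully qualified class name
    configs.put("partition.assignment.strategy",
            "org.apache.kafka.clients.consumer.RoundRobinAssignor");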

Round Robin documentation:

The roundrobin assignor lays out all the available partitions and all the available consumers. It then proceeds to do a roundrobin assignment from partition to consumer. If the subscriptions of all consumer instances are identical, then the partitions will be uniformly distributed. (i.e., the partition ownership counts will be within a delta of exactly one across all consumers.) For example, suppose there are two consumers C0 and C1, two topics t0 and t1, and each topic has 3 partitions, resulting in partitions t0p0, t0p1, t0p2, t1p0, t1p1, and t1p2. The assignment will be: C0: [t0p0, t0p2, t1p1] C1: [t0p1, t1p0, t1p2]

Range Assignor documentation:

The range assignor works on a per-topic basis. For each topic, we lay out the available partitions in numeric order and the consumers in lexicographic order. We then divide the number of partitions by the total number of consumers to determine the number of partitions to assign to each consumer. If it does not evenly divide, then the first few consumers will have one extra partition. For example, suppose there are two consumers C0 and C1, two topics t0 and t1, and each topic has 3 partitions, resulting in partitions t0p0, t0p1, t0p2, t1p0, t1p1, and t1p2. The assignment will be: C0: [t0p0, t0p1, t1p0, t1p1] C1: [t0p2, t1p2]

So, if all our topics have only one partition, only one consumer will work.
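A quick way to check this from the question's code is to ask the consumer for the topic's partition metadata (a sketch using KafkaConsumer.partitionsFor from the same 0.9 API; it assumes the consumer and topics fields of the Main class above, and needs java.util.List and org.apache.kafka.common.PartitionInfo imported):

    // With a single partition, the range assignor can only ever hand
    // work to one member of the group
    for (String topic : topics) {
        List<PartitionInfo> partitions = consumer.partitionsFor(topic);
        System.out.println(topic + " has " + partitions.size() + " partition(s)");
    }

If this reports a single partition, adding partitions to the topic (e.g. with kafka-topics.sh --alter --partitions 2) should let the second consumer in the group receive work.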
