
Kafka CommitFailedException consumer exception

After creating multiple consumers (using the Kafka 0.9 Java API) and starting each thread, I'm getting the following exception:

Consumer has failed with exception: org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed due to group rebalance
class com.messagehub.consumer.Consumer is shutting down.
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed due to group rebalance
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:546)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:487)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:681)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:654)
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:350)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:288)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:157)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:352)
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:936)
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:905)

The consumers then start consuming messages normally. I would like to know what is causing this exception so that I can fix it.

Also try tweaking the following parameters:

  • heartbeat.interval.ms - This tells Kafka how many milliseconds to wait before the consumer is considered "dead"
  • max.partition.fetch.bytes - This limits the amount of data (and therefore the number of messages) the consumer will receive per poll
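
As a minimal sketch of where those two parameters go, here is a hypothetical consumer configuration (the broker address, group id, and all numeric values are illustrative, not from the original post, and must be tuned to your own processing times):

```java
import java.util.Properties;

public class ConsumerConfigSketch {

    // Base settings for a consumer; values here are examples only.
    static Properties baseConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "my-group");                // hypothetical group id
        props.put("heartbeat.interval.ms", "3000");       // how often heartbeats are sent
        props.put("session.timeout.ms", "30000");         // must be larger than the heartbeat interval
        props.put("max.partition.fetch.bytes", "1048576"); // 1 MiB per partition per fetch
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // These Properties would be passed to new KafkaConsumer<String, String>(...)
        Properties props = baseConfig();
        System.out.println(props.getProperty("heartbeat.interval.ms")); // prints 3000
    }
}
```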

I noticed that rebalancing occurs if the consumer does not commit to Kafka before the heartbeat times out. If the commit happens after the messages are processed, then the time needed to process them determines how these parameters should be set. So decreasing the number of messages per poll and increasing the heartbeat time will help avoid rebalancing.
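
The sizing logic above can be sketched as simple arithmetic (all numbers below are illustrative assumptions, not measurements from the original post): the worst-case time to process one batch must stay below the session timeout, or a rebalance is triggered.

```java
public class PollSizingSketch {

    // Worst-case wall-clock time to process one poll() batch before committing.
    static long worstCaseMs(long perMessageMs, long messagesPerPoll) {
        return perMessageMs * messagesPerPoll;
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 30_000; // hypothetical session.timeout.ms
        long perMessageMs = 50;         // assumed average processing time per message
        long messagesPerPoll = 500;     // assumed batch size per poll()

        // 50 * 500 = 25_000 ms, which is under the 30_000 ms timeout,
        // so the commit should happen before the consumer is considered dead.
        System.out.println(worstCaseMs(perMessageMs, messagesPerPoll) < sessionTimeoutMs);
    }
}
```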

Also consider using more partitions, so that more threads can process your data even with fewer messages per poll.

I wrote this small application to run tests. Hope it helps.

https://github.com/ajkret/kafka-sample

UPDATE

Kafka 0.10.x now offers a new parameter to control the number of messages received: max.poll.records - the maximum number of records returned in a single call to poll().
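
A minimal sketch of setting it (the value 100 is an arbitrary example; choose a cap your processing loop can finish within the session timeout):

```java
import java.util.Properties;

public class MaxPollRecordsSketch {

    // Adds the 0.10.x batch-size cap to an existing consumer configuration.
    static Properties withMaxPollRecords(Properties props, int maxRecords) {
        props.put("max.poll.records", Integer.toString(maxRecords));
        return props;
    }

    public static void main(String[] args) {
        Properties props = withMaxPollRecords(new Properties(), 100);
        // poll() will now return at most 100 records per call
        System.out.println(props.getProperty("max.poll.records")); // prints 100
    }
}
```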

UPDATE

Kafka offers a way to pause the queue. While the queue is paused, you can process the messages in a separate thread while still calling KafkaConsumer.poll() to send heartbeats. Then call KafkaConsumer.resume() after the processing is done. This way you mitigate rebalances caused by missed heartbeats. Here is an outline of what you can do:

// Assumes: a KafkaConsumer<String, String> named "consumer",
// an ExecutorService named "workers", and imports for
// Future, TimeUnit, TimeoutException, ExecutionException.
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Integer.MAX_VALUE);
    consumer.commitSync();

    // Pause all assigned partitions: poll() then sends heartbeats but
    // returns no records until resume() is called.
    // (pause(Collection) requires client 0.10.1+; older clients take varargs.)
    consumer.pause(consumer.assignment());
    for (ConsumerRecord<String, String> record : records) {

        Future<Boolean> future = workers.submit(() -> {
            // Process the record here
            return true;
        });

        // While the worker runs, keep calling poll() so heartbeats go out
        while (true) {
            try {
                if (future.get(1, TimeUnit.SECONDS) != null) {
                    break;
                }
            } catch (TimeoutException e) {
                consumer.poll(0); // heartbeat only; no records while paused
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
    }

    consumer.resume(consumer.assignment());
}

It's a consumer-group rebalancing issue, as the error says. Can you tell us: how many partitions was the topic created with? How many consumers are running? Do they belong to the same group?
