Retrieve always latest messages from Kafka on reconnection

I'm writing a piece of code that needs to read hundreds of messages from Kafka every few milliseconds. I'm using C++ and librdkafka. When my program stops and then restarts, it does not need to recover the messages it missed while it was stopped; instead it must always start reading from the latest messages sent.

As far as I know, I can manage consumer offsets by playing with enable.auto.commit and auto.offset.reset. However, the latter only takes effect when there are no committed offsets, while the former lets me decide myself which offsets to store.

Playing with these two values, I found that if I set enable.auto.commit to false (without committing any offset) and auto.offset.reset to latest, the consumer always seems to retrieve the latest messages. But how clean is this solution?
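To make the combination described above concrete, here is a minimal sketch that only collects the properties in a plain map. The helper name latest_only_properties and the choice to include group.id are assumptions for illustration; nothing here calls librdkafka itself.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not part of librdkafka): gathers the settings discussed
// above so they can be applied to a consumer configuration in one place.
std::map<std::string, std::string> latest_only_properties(const std::string &group_id) {
  return {
      {"group.id", group_id},           // assumed: the consumer joins a group
      {"enable.auto.commit", "false"},  // never store offsets automatically
      {"auto.offset.reset", "latest"},  // no committed offset -> start at the log end
  };
}
```

Each pair would then be passed to RdKafka::Conf::set(name, value, errstr) on the global configuration object. Note that this only yields "start at latest" while the group has no committed offsets; if offsets were ever committed for the group, the consumer resumes from them instead.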

My fear is that if two messages are sent between two consumer polls, my consumer will only take the latest one, or that if no messages were sent, it will keep reading the same message again. Both are unwanted behaviours.
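The two fears above can be checked against a toy model of how a consumer position behaves. This is a sketch under stated assumptions: ToyPartition and ToyConsumer are hypothetical in-memory stand-ins, not librdkafka types. They model the idea that "latest" only picks the starting position, after which the position advances one offset at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a single partition: the vector index is the offset.
struct ToyPartition {
  std::vector<std::string> log;
  void produce(const std::string &payload) { log.push_back(payload); }
  std::size_t end_offset() const { return log.size(); }
};

// Hypothetical consumer that starts at the log end, as a consumer with
// auto.offset.reset=latest and no committed offset would.
class ToyConsumer {
 public:
  explicit ToyConsumer(const ToyPartition &partition)
      : partition_(partition), position_(partition.end_offset()) {}

  // Returns every message appended since the previous poll, in order,
  // then moves the position to the current log end.
  std::vector<std::string> poll() {
    assert(position_ <= partition_.end_offset());  // position never passes the log end
    std::vector<std::string> batch(
        partition_.log.begin() + static_cast<std::ptrdiff_t>(position_),
        partition_.log.end());
    position_ = partition_.end_offset();
    return batch;
  }

 private:
  const ToyPartition &partition_;
  std::size_t position_;
};
```

In this model, two messages produced between polls are both returned in order, and a poll with nothing new returns an empty batch instead of repeating the previous message. The toy returns a batch per poll for brevity; librdkafka's consume() hands back one message per call, but the position advances in the same sequential way.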

Another idea was to clear the consumer group offsets or to seek forward, but the seek method in librdkafka does not seem to work as needed, and I cannot find any methods to manage consumer groups.

How can I always read the latest messages from Kafka using librdkafka?

In the end I solved it by handling the rebalance callback myself. This callback is always executed when a consumer joins or leaves the group.

The rebalance callback is responsible for updating librdkafka's assignment set based on the two events: RdKafka::ERR__ASSIGN_PARTITIONS and RdKafka::ERR__REVOKE_PARTITIONS.

So within the rebalance callback I iterate over the TopicPartitions, set each one to the latest offset, and then assign them to the consumer. The snippet of code is this:

#include <iostream>
#include <vector>
#include <librdkafka/rdkafkacpp.h>

class SeekEndRebalanceCb : public RdKafka::RebalanceCb {
 public:
  void rebalance_cb(RdKafka::KafkaConsumer *consumer, RdKafka::ErrorCode err,
                    std::vector<RdKafka::TopicPartition *> &partitions) override {
    if (err == RdKafka::ERR__ASSIGN_PARTITIONS) {
      // Point every newly assigned partition at its log end, so any
      // previously committed offset is ignored.
      for (auto *partition : partitions) {
        partition->set_offset(RdKafka::Topic::OFFSET_END);
      }
      // Assign once, after all offsets have been set.
      consumer->assign(partitions);
    } else if (err == RdKafka::ERR__REVOKE_PARTITIONS) {
      // Partitions were taken away: drop the current assignment.
      consumer->unassign();
    } else {
      std::cerr << "Rebalancing error: " << RdKafka::err2str(err) << std::endl;
    }
  }
};

To use the callback, register it on the configuration object before the consumer is created (here conf is the RdKafka::Conf that is later passed to RdKafka::KafkaConsumer::create):

SeekEndRebalanceCb ex_rb_cb;
if (conf->set("rebalance_cb", &ex_rb_cb, errstr) != RdKafka::Conf::CONF_OK) {
  std::cerr << errstr << std::endl;
  return false;
}

consumer->assign(partitions) should be invoked once, after the for loop has finished:

class SeekEndRebalanceCb : public RdKafka::RebalanceCb {
 public:
  void rebalance_cb(RdKafka::KafkaConsumer *consumer, RdKafka::ErrorCode err,
                    std::vector<RdKafka::TopicPartition *> &partitions) {
    if (err == RdKafka::ERR__ASSIGN_PARTITIONS) {
      for (auto partition = partitions.begin(); partition != partitions.end(); partition++) {
        (*partition)->set_offset(RdKafka::Topic::OFFSET_END);
      }
      consumer->assign(partitions);
    } else if (err == RdKafka::ERR__REVOKE_PARTITIONS) {
      consumer->unassign();
    } else {
      std::cerr << "Rebalancing error: " << RdKafka::err2str(err) << std::endl;
    }
  }
};
