
Reliably get the last (already produced) message from Kafka topic

I am doing something like the following pseudocode:

var consumer = new KafkaConsumer();
consumer.assign(topicPartitions);
var beginOff = consumer.beginningOffsets(topicPartitions);
var endOff = consumer.endOffsets(topicPartitions);
var lastOffsets = Math.max(beginOff, endOff - 1);
lastOffsets.forEach(consumer::seek);
lastMessages = consumer.poll(1 sec);
// do something with the received messages
consumer.close();

In the simple test that I did, this works, but I wonder whether there are cases, such as producer crashes, where offsets do not increase monotonically by one. In that case, would I have to seek() my way back in time, or can I get the offset of the last already produced message from Kafka?

I am not using transactions, so we don't need to worry about read-committed vs. uncommitted messages.

Edit: One case where offsets are not consecutive is after log compaction. However, log compaction should always keep the last message, since it is obviously more recent than all preceding messages (same key or not). The offset just before that last message, though, could in theory have been compacted away.
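To make the compaction case concrete, here is a minimal sketch of a toy model. It is not Kafka code: the class and method names are hypothetical, and it ignores segments, tombstones, and the timing of the log cleaner. It only shows that keeping the newest record per key can leave a gap right before the last offset while the last offset itself survives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one compacted partition; names are illustrative, not Kafka APIs.
final class CompactionSketch {

    // keysByOffset.get(i) is the key of the record written at offset i.
    // Returns the offsets that remain when only the newest record per key is kept.
    static List<Long> survivingOffsets(List<String> keysByOffset) {
        Map<String, Long> newestOffsetByKey = new HashMap<>();
        for (int offset = 0; offset < keysByOffset.size(); offset++) {
            // A later record with the same key replaces the earlier one.
            newestOffsetByKey.put(keysByOffset.get(offset), (long) offset);
        }
        List<Long> offsets = new ArrayList<>(newestOffsetByKey.values());
        Collections.sort(offsets);
        return offsets;
    }

    public static void main(String[] args) {
        // Offsets 0..3, so the end offset in this model is 4.
        List<String> keys = List.of("k1", "k2", "k1", "k1");
        System.out.println(survivingOffsets(keys)); // prints [1, 3]
    }
}
```

In this model offset 2 is gone, but offset 3 (the end offset minus one) is still present, which matches the reasoning in the edit above.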


The Javadoc at kafka.apache.org/10/javadoc/ states the following for consumer.endOffsets:

Get the last offset for the given partitions. The last offset of a partition is the offset of the upcoming message, i.e. the offset of the last available message + 1.

So endOff - 1 is the offset of the last available Kafka record in that topic partition at the moment you fetched the end offsets. Producer-side problems do not affect this.
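The arithmetic can be sketched without a broker. The helper below is hypothetical and not part of the Kafka client: it uses plain strings as stand-ins for TopicPartition and takes maps shaped like the results of consumer.beginningOffsets and consumer.endOffsets. Unlike the Math.max in the question, it leaves out partitions that are currently empty (beginning offset equal to end offset), on the assumption that there is nothing to read there.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

// Hypothetical helper, not part of the Kafka client API.
final class LastOffsets {

    // Offset of the last available record, or empty if the partition
    // currently holds no records (beginning offset == end offset).
    static OptionalLong lastRecordOffset(long beginningOffset, long endOffset) {
        return endOffset > beginningOffset
                ? OptionalLong.of(endOffset - 1)
                : OptionalLong.empty();
    }

    // Computes a seek target per partition; empty partitions are left out.
    static Map<String, Long> seekTargets(Map<String, Long> beginningOffsets,
                                         Map<String, Long> endOffsets) {
        Map<String, Long> targets = new TreeMap<>();
        endOffsets.forEach((partition, end) -> {
            // Assumption: a missing beginning offset is treated as 0.
            long begin = beginningOffsets.getOrDefault(partition, 0L);
            lastRecordOffset(begin, end).ifPresent(o -> targets.put(partition, o));
        });
        return targets;
    }

    public static void main(String[] args) {
        // Made-up offsets for three partitions of a hypothetical "events" topic.
        Map<String, Long> begin = Map.of("events-0", 0L, "events-1", 40L, "events-2", 7L);
        Map<String, Long> end = Map.of("events-0", 12L, "events-1", 41L, "events-2", 7L);
        System.out.println(seekTargets(begin, end));
    }
}
```

With a real consumer, each resulting entry would be passed to consumer.seek(partition, offset) before calling consumer.poll.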

One more thing: offsets are not decided by the producer. They are assigned by the leader of that topic partition as records are appended, so at write time they always increase monotonically by one. Gaps can only appear afterwards, when records are removed, for example by log compaction.
