Kafka reset tool: Consumer offset not resetting to zero
I am trying to understand some fundamental Kafka concepts so that I can properly monitor the progress of my Kafka Streams based application.
Specifically, for debugging purposes I need my application to re-consume a whole topic. For that I used the reset tool.
After executing the script and looking at some input topics in Kafka Manager, I see that the Consumer Offset has decreased and the Lag has increased (which makes sense). However, the Consumer Offset does not go to zero. I am trying to interpret that, but I haven't found a concrete explanation of what Consumer Offset and Logsize in Kafka Manager refer to.
To make it fit what I see, I assume that Logsize is the total number of messages placed into the topic since its beginning, but not necessarily the number of messages currently in the topic, as some may have been discarded because their age exceeded the retention period. Am I right?
If not, what explains the fact that, after running the reset tool, for some input topics I observe that the Consumer Offset is equal to the Logsize (and not zero) and the Lag is zero?
I am not familiar with yahoo-kafka-manager; however, you can also use bin/kafka-consumer-groups.sh (a tool shipped with Kafka itself). There, LOG-END-OFFSET means what you describe. From a naming perspective, it's unclear to me whether Logsize is the same as "log end offset" or the difference between the highest and lowest offset in a partition.
> After executing the script looking into the Kafka Manager for some inputs topics I see that the Consumer Offset has decreased and the Lag has increased.
This makes sense: since "lag" is the difference between "log end offset" and "committed offset", the lag should increase after resetting your application.
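As a small worked example of that definition, the sketch below computes the lag for one partition before and after a reset. The `lag` helper and the offsets are made up for illustration and are not taken from the question.

```python
def lag(log_end_offset: int, committed_offset: int) -> int:
    """Lag of a consumer group on one partition, as defined above."""
    return log_end_offset - committed_offset


log_end_offset = 1000  # hypothetical value

# Before the reset the application had consumed everything.
before_reset = lag(log_end_offset, committed_offset=1000)

# After the reset the committed offset points at an earlier position.
after_reset = lag(log_end_offset, committed_offset=400)

print(before_reset, after_reset)
```

A smaller committed offset with an unchanged log end offset always yields a larger lag, which is the effect described in the question.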
However, I am not sure why the committed consumer group offset is not zero. Can you verify what you observe using bin/kafka-consumer-groups.sh? Maybe yahoo-kafka-manager reports something different.
Update: the tool does not set the offset to zero but to "beginning of log". (The docs are not correct.)
Also note that the auto.offset.reset strategy might kick in after you reset your application and restart it (a [committed] offset of zero might not be valid if the log got truncated). Could this explain the behavior you observe?
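The interaction between a committed offset and auto.offset.reset can be sketched as follows. This is a simplified model, not Kafka's actual implementation: the function `resolve_start_position` and the example offsets are hypothetical, and only the policy names "earliest", "latest", and "none" correspond to real configuration values.

```python
from typing import Optional


def resolve_start_position(committed: Optional[int],
                           log_start: int,
                           log_end: int,
                           auto_offset_reset: str = "earliest") -> int:
    """Return the offset a consumer would start from (simplified model)."""
    # A committed offset is usable only if it still lies inside the log.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    # Otherwise the reset policy decides where consumption starts.
    if auto_offset_reset == "earliest":
        return log_start
    if auto_offset_reset == "latest":
        return log_end
    # "none" (or anything else) is modelled here as an error.
    raise ValueError("no valid committed offset and no reset policy")


# The log was truncated, so offset 0 no longer exists (hypothetical numbers).
log_start, log_end = 400, 1000

print(resolve_start_position(0, log_start, log_end, "earliest"))
print(resolve_start_position(0, log_start, log_end, "latest"))
print(resolve_start_position(500, log_start, log_end, "latest"))
```

Under these assumptions, an invalid committed offset combined with a "latest" policy would put the consumer at the log end offset with a lag of zero, which is one possible way to arrive at the situation described in the question.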
This blog post might also help to understand further details: https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/