
Kafka reset tool Consumer offset not resetting to zero

I am trying to understand some fundamental Kafka concepts so that I can properly monitor the progress of my KafkaStreams-based application.

Specifically, for debugging purposes I need to be able to have my application re-consume a whole topic. For that I used the reset tool.

After executing the script and looking into the Kafka Manager for some input topics, I see that the Consumer Offset has decreased and the Lag has increased (which makes sense). However, the Consumer Offset is not going to zero. I am trying to interpret that, but I haven't found a concrete explanation of what the Consumer Offset and Logsize in Kafka Manager refer to.

To make it fit what I see, I assume that the Logsize is the total number of messages placed into the topic since its beginning, but not necessarily the number of messages currently in the topic, as some may have been discarded because their age exceeded the retention period. Am I right?

If not, then what explains the fact that, after running the reset tool, for some input topics I observe that the Consumer Offset is equal to the Logsize (and not zero) and the Lag is zero?

I am not familiar with yahoo-kafka-manager; however, you can also use bin/kafka-consumer-groups.sh (a tool shipped with Kafka itself). There, LOG-END-OFFSET means what you describe. From a naming perspective, it's unclear to me whether Logsize is the same as "log end offset" or the difference between the highest and lowest offset in a partition.
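
To make the two possible readings of Logsize concrete, here is a minimal Python sketch. The `PartitionLog` class and both function names are hypothetical, invented for illustration only; they are not part of Kafka or of Kafka Manager. The sketch also assumes a partition whose offsets started at zero and have no gaps, so the numbers are approximations of message counts at best.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionLog:
    """Hypothetical snapshot of one partition (illustration only)."""
    log_start_offset: int  # offset of the oldest record still retained
    log_end_offset: int    # offset the next produced record would receive


def written_since_beginning(p: PartitionLog) -> int:
    # Reading 1: Logsize is the "log end offset", i.e. roughly everything
    # ever appended, including records already removed by retention.
    return p.log_end_offset


def currently_retained(p: PartitionLog) -> int:
    # Reading 2: Logsize is the span between the highest and lowest offset,
    # i.e. roughly what is still stored in the partition.
    return p.log_end_offset - p.log_start_offset


# Assumed example: 1000 records were appended, the oldest 400 have expired.
p = PartitionLog(log_start_offset=400, log_end_offset=1000)
print(written_since_beginning(p))  # 1000
print(currently_retained(p))       # 600
```

Which of the two numbers a given monitoring tool labels as Logsize has to be checked against that tool's own documentation.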

"After executing the script looking into the Kafka Manager for some inputs topics I see that the Consumer Offset has decreased and the Lag has increased."

This makes sense: since "lag" is the difference between the "log end offset" and the "committed offset", the lag should increase after resetting your application. However, I am not sure why the committed consumer group offset is not zero (can you verify what you observe using bin/kafka-consumer-groups.sh? Maybe yahoo-kafka-manager reports something different).
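
The lag definition above can be sketched in a few lines of Python. The function name `consumer_lag` and the offsets are made up for illustration; nothing here calls Kafka.

```python
def consumer_lag(log_end_offset: int, committed_offset: int) -> int:
    """Lag of one partition: records appended but not yet committed as consumed."""
    return log_end_offset - committed_offset


log_end = 1000  # assumed log end offset of one partition

# Before the reset the group had caught up with the end of the log.
print(consumer_lag(log_end, committed_offset=1000))  # 0

# After the reset the committed offset is lower, so the lag grows.
print(consumer_lag(log_end, committed_offset=400))   # 600
```

A lower committed offset with an unchanged log end offset is exactly the "Consumer Offset decreased, Lag increased" pattern described in the question.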

Update: the tool does not set the offset to zero, however, but to the "beginning of the log". (The docs are not correct.)

Also note that the auto.offset.reset strategy might kick in after you reset your application and restart it (a [committed] offset of zero might not be valid if the log was truncated). Could this explain the behavior you observe?
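
As a rough illustration of that point, the following Python sketch models where a consumer would start reading. It is a simplified model under stated assumptions, not Kafka's actual implementation: `resolve_start_position` is a hypothetical helper, and the only behavior taken from Kafka is that a missing or out-of-range offset falls back to the auto.offset.reset policy (earliest, latest, or none).

```python
from typing import Optional


def resolve_start_position(
    committed: Optional[int],
    log_start_offset: int,
    log_end_offset: int,
    auto_offset_reset: str = "earliest",
) -> int:
    """Simplified model of the offset a consumer starts reading from."""
    # A committed offset inside the retained range is used as-is.
    if committed is not None and log_start_offset <= committed <= log_end_offset:
        return committed
    # Otherwise the configured reset policy decides.
    if auto_offset_reset == "earliest":
        return log_start_offset
    if auto_offset_reset == "latest":
        return log_end_offset
    # With "none" the real consumer raises an exception; modeled as an error here.
    raise RuntimeError("no valid offset and auto.offset.reset is 'none'")


# Assumed partition: offsets 0..399 were removed by retention, log end is 1000.
print(resolve_start_position(0, 400, 1000, "earliest"))  # 400
print(resolve_start_position(0, 400, 1000, "latest"))    # 1000
print(resolve_start_position(400, 400, 1000))            # 400
```

Under these assumptions, the "latest" case would leave the consumer at the log end offset with zero lag, which resembles the Consumer Offset equal to Logsize described in the question. Whether that is what actually happened depends on the application's configuration.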

This blog post might also help with further details: https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/
