
Kafka: Number of messages in a topic

I need the number of messages currently stored in a Kafka topic, regardless of whether any consumer has consumed them.

kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092,localhost:9093,localhost:9094 --topic test-topic

Does the above give the offset for the topic?

Is that equal to the number of messages currently stored in the Kafka topic?

Not exactly. The numbers you got refer only to the current end (maximum) offsets of the topic's partitions. The message count also depends on each partition's beginning offset.

You could run

kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092,localhost:9093,localhost:9094 --topic test-topic --time -1

and

kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092,localhost:9093,localhost:9094 --topic test-topic --time -2

respectively, and calculate the message count for each partition by subtracting its beginning offset (from --time -2) from its end offset (from --time -1), then sum the results to get the total record count for the topic.
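The per-partition subtraction described above could be sketched roughly as follows. This is only an illustration: the function name count_messages and the file names end.txt and begin.txt are hypothetical, and the sketch assumes each output line has the form topic:partition:offset.

```shell
#!/usr/bin/env bash
# Sketch: per-partition and total message counts from two saved outputs.
# Assumes each input line looks like "test-topic:0:42" (topic:partition:offset).

count_messages() {
  # $1 = file holding the --time -1 output (end offsets); must be non-empty
  # $2 = file holding the --time -2 output (beginning offsets)
  awk -F: '
    NR == FNR { end[$2] = $3; next }   # first file: remember end offset per partition
    ($2 in end) {                      # second file: subtract the beginning offset
      n = end[$2] - $3
      printf "partition %s: %d\n", $2, n
      total += n
    }
    END { printf "total: %d\n", total }
  ' "$1" "$2"
}

# Possible usage (file names are placeholders):
#   kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test-topic --time -1 > end.txt
#   kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test-topic --time -2 > begin.txt
#   count_messages end.txt begin.txt
```

Matching on the partition number, rather than relying on line order, keeps the result correct even if the two commands list partitions in a different order.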

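Yes, if the earliest offset is zero, this equals the number of messages. If the earliest offset is not zero, you need to compute the difference for each partition and then sum over all partitions.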

Does the above give the offset for the topic? Yes, it gives the current maximum offset.

Is the above equal to the number of messages currently stored in the Kafka topic? No. Messages are deleted from the topic once the retention period expires, so the offset is not the same as the message count.
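A small worked example may make the difference concrete. The numbers below are made up for illustration and are not real tool output.

```shell
# Hypothetical offsets for a single partition (illustrative values only).
end_offset=1000        # what --time -1 would report
beginning_offset=400   # what --time -2 would report once retention removed older records
stored=$((end_offset - beginning_offset))
echo "end offset: ${end_offset}, stored messages: ${stored}"
```

Here the end offset alone would overstate the stored message count by the 400 records that retention has already removed.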

To get the number of messages in Kafka:

# Replace the placeholders with your broker list and topic name.
brokers="<broker1:port>"
topic="<topic-name>"
# Sum of the end offsets (--time -1) across all partitions.
sum_1=$(/usr/hdp/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "$brokers" --topic "$topic" --time -1 | grep -e ':[[:digit:]]*:' | awk -F ":" '{sum += $3} END {print sum + 0}')
# Sum of the beginning offsets (--time -2) across all partitions.
sum_2=$(/usr/hdp/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "$brokers" --topic "$topic" --time -2 | grep -e ':[[:digit:]]*:' | awk -F ":" '{sum += $3} END {print sum + 0}')
echo "Number of records in topic ${topic}: $((sum_1 - sum_2))"

Here, --time -1 returns the current maximum (end) offset and --time -2 returns the current minimum (beginning) offset.
