How to programmatically get latest offset per Kafka topic partition in Python
I'm new to Kafka and want to get the position of a Kafka topic, per partition. I see in the documentation - https://kafka-python.readthedocs.io/en/master/apidoc/KafkaAdminClient.html#kafkaadminclient - that the offset is available via the function KafkaAdminClient.list_consumer_group_offsets, but I don't see an equivalent method for the position there.
Does anyone know how I can get it?
You can use position:
Retrieve the current positions (offsets) for the list of partitions.
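from confluent_kafka import Consumer, TopicPartition
# confluent_kafka consumers need a group.id; "offset-checker" is a placeholder name
consumer = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "offset-checker"})
# Fetch the topic metadata to discover its partition ids
topic = consumer.list_topics(topic='topicName')
partitions = [TopicPartition('topicName', partition) for partition in topic.topics['topicName'].partitions]
# position() reports this consumer's own position in each partition
offset_per_partition = consumer.position(partitions)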
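Alternatively, you can use get_watermark_offsets, but you have to pass one partition at a time, so it takes multiple calls:
Retrieve low and high offsets for the specified partition.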
from confluent_kafka import Consumer, TopicPartition
# confluent_kafka consumers need a group.id; "offset-checker" is a placeholder name
consumer = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "offset-checker"})
topic = consumer.list_topics(topic='topicName')
partitions = [TopicPartition('topicName', partition) for partition in topic.topics['topicName'].partitions]
for p in partitions:
    # One broker query per partition; the high watermark is the latest offset
    low_offset, high_offset = consumer.get_watermark_offsets(p)
    print(f"Latest offset for partition {p.partition}: {high_offset}")
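Once the (low, high) pairs have been collected, they can be turned into a per-partition summary. The following is only a minimal sketch: the helper name retained_offset_span and the sample numbers are hypothetical, and it works on a plain dict so it runs without a broker.

```python
def retained_offset_span(watermarks):
    """Given {partition_id: (low, high)}, return {partition_id: high - low}.

    high - low is the number of offsets currently retained in the partition.
    It is an upper bound on the message count, since compaction or
    transaction markers can leave gaps in the offset range.
    """
    return {partition: high - low for partition, (low, high) in watermarks.items()}


# Hypothetical watermark pairs, shaped like the (low, high) tuples
# returned by get_watermark_offsets for partitions 0, 1 and 2.
sample = {0: (0, 120), 1: (35, 80), 2: (10, 10)}
spans = retained_offset_span(sample)
print(spans)
```

A span of 0 means the low and high watermarks coincide, i.e. the partition currently holds no readable messages.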
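You can use end_offsets:
Get the last offset for the given partitions. The last offset of a partition is the offset of the upcoming message, i.e. the offset of the last available message + 1.
This method does not change the current consumer position of the partitions.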
from kafka import KafkaConsumer, TopicPartition
consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
# partitions_for_topic returns None for an unknown topic, hence the "or ()"
partitions = [TopicPartition('myTopic', p) for p in consumer.partitions_for_topic('myTopic') or ()]
# Maps each TopicPartition to its end offset
last_offset_per_partition = consumer.end_offsets(partitions)
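Because the end offset is "last available message + 1", converting the result into the offset of the last message is a one-line adjustment. Here is a rough sketch of that step, assuming a dict shaped like the one end_offsets returns. The helper name last_message_offsets and the sample values are hypothetical, and TopicPartition is redefined locally as a stand-in so the snippet runs without kafka-python or a broker.

```python
from collections import namedtuple

# Stand-in for kafka.TopicPartition so the sketch runs without a broker;
# kafka-python's TopicPartition is likewise a (topic, partition) namedtuple.
TopicPartition = namedtuple("TopicPartition", ["topic", "partition"])


def last_message_offsets(end_offsets):
    """Convert end offsets (next offset to be written) into last-message offsets.

    A partition whose end offset is 0 has never received a message,
    so it maps to None instead of -1.
    """
    return {tp: (end - 1 if end > 0 else None) for tp, end in end_offsets.items()}


# Hypothetical result, shaped like the dict returned by end_offsets.
sample_end_offsets = {
    TopicPartition("myTopic", 0): 42,
    TopicPartition("myTopic", 1): 0,
}
last_offsets = last_message_offsets(sample_end_offsets)
for tp, offset in sorted(last_offsets.items()):
    print(f"{tp.topic}[{tp.partition}] last message offset: {offset}")
```

Note that end - 1 is not guaranteed to hold a readable record, for example when retention has already removed it, so treat the value as an approximation.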
If you want to iterate over all topics, the following does the trick:
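from kafka import KafkaConsumer, TopicPartition
consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
# topics() returns the set of topic names visible to this consumer
kafka_topics = consumer.topics()
for topic in kafka_topics:
    partitions = [TopicPartition(topic, p) for p in consumer.partitions_for_topic(topic) or ()]
    last_offset_per_partition = consumer.end_offsets(partitions)
    print(topic, last_offset_per_partition)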