I have a Python Kafka consumer with enable_auto_commit set to False, and I am committing messages manually. However, after a restart the consumer consumes the last message from each partition again. Only the last one, not more.
This is what kafka-consumer-groups shows:
TOPIC     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
my-topic  0          0               1               1
my-topic  1          3               4               1
I don't know why it shows lag, or why the current offset is set to the last message instead of the next one. When I commit offset 3, shouldn't the current offset move to 4?
I commit every message I consume, but then on restart, it always consumes the last message again.
EDIT: This is the code I use:
self.subscriber = kafka.KafkaConsumer(self.consumer_topic,
                                      client_id=self.consumer_name, group_id=group_id,
                                      bootstrap_servers=self.consumer_bootstrap_server,
                                      consumer_timeout_ms=timeout_ms, enable_auto_commit=False)

for record in self.subscriber:
    offset = CommittableOffset(record.topic, record.partition, record.offset)
    # process message
    partition = TopicPartition(record.topic, record.partition)
    offset = OffsetAndMetadata(record.offset, None)
    self.subscriber.commit({partition: offset})
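It turns out the kafka-python library works a little differently from the Java/Scala libraries I was used to. With the Java/Scala library, when I committed a message, what actually got committed was the message offset + 1. With kafka-python I have to add 1 to the offset myself, because the committed offset is treated as the next offset to read, not the last one processed.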
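A minimal sketch of that fix, assuming the same record fields as in the question (topic, partition, offset). To keep it runnable without a broker, TopicPartition and OffsetAndMetadata are namedtuple stand-ins here; in real code they would come from kafka-python, where the exact fields of OffsetAndMetadata may differ between versions. The helper names offsets_to_commit and commit_record are made up for illustration.

```python
from collections import namedtuple

# Stand-ins (assumption) for kafka-python's TopicPartition and OffsetAndMetadata,
# so the sketch runs without the library or a broker.
TopicPartition = namedtuple("TopicPartition", ["topic", "partition"])
OffsetAndMetadata = namedtuple("OffsetAndMetadata", ["offset", "metadata"])


def offsets_to_commit(record):
    """Build the commit map for a record that has been fully processed.

    The committed offset is the offset of the NEXT message to read,
    hence record.offset + 1.
    """
    partition = TopicPartition(record.topic, record.partition)
    return {partition: OffsetAndMetadata(record.offset + 1, None)}


def commit_record(consumer, record):
    """Commit a processed record on a consumer exposing commit(offsets)."""
    consumer.commit(offsets_to_commit(record))
```

In the loop from the question this would replace the last three lines with commit_record(self.subscriber, record). After processing offset 3 on partition 1, offset 4 is committed, so kafka-consumer-groups should report CURRENT-OFFSET 4 and LAG 4 - 4 = 0 for that partition.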