
Kafka broker offsets/log retention and consumer offset reset in earliest mode

Problem description:

Our Kafka consumers (developed with Spring Boot 2.x) run for several days at a time. When we restart those consumers, all messages in the topic are consumed again, but only under specific conditions.

Conditions:

We suppose that the combination of the broker/topic config ( log.retention.* , offsets.retention.* ) and the consumer config ( auto.offset.reset = earliest ) is causing this behavior.
Obviously we can't set the consumer to "latest", because if the consumer is stopped and new messages arrive, those messages won't be consumed when the consumer starts again.

Question:

What is the correct setup to avoid this situation?
In the latest Kafka broker release (2.x), the default values for log.retention.* and offsets.retention.* are the same ( https://cwiki.apache.org/confluence/display/KAFKA/KIP-186%3A+Increase+offsets+retention+default+to+7+days ).

Could this new configuration setup solve the problem?

Consumer configuration (auto-commit is delegated to the Spring Cloud Stream framework):

           auto.commit.interval.ms = 100
           auto.offset.reset = earliest
           bootstrap.servers = [server1:9092]
           check.crcs = true
           client.id = 
           connections.max.idle.ms = 540000
           enable.auto.commit = false
           exclude.internal.topics = true
           fetch.max.bytes = 52428800
           fetch.max.wait.ms = 500
           fetch.min.bytes = 1
           group.id = consumer_group1
           heartbeat.interval.ms = 3000
           interceptor.classes = null
           internal.leave.group.on.close = true
           isolation.level = read_uncommitted
           key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
           max.partition.fetch.bytes = 1048576
           max.poll.interval.ms = 300000
           max.poll.records = 500
           metadata.max.age.ms = 300000
           metrics.recording.level = INFO
           metrics.sample.window.ms = 30000
           partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
           receive.buffer.bytes = 65536
           reconnect.backoff.max.ms = 1000
           reconnect.backoff.ms = 50
           request.timeout.ms = 305000
           retry.backoff.ms = 100
           value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer

Broker configuration:

           log.retention.ms = 86400000
           log.retention.minutes = 10080
           log.retention.hours = 168
           log.retention.bytes = -1

           offsets.retention.ms = 864000000
           offsets.retention.minutes = 14400
           offsets.retention.hours = 240 

           unclean.leader.election.enable = false
           log.cleaner.enable = true
           auto.leader.rebalance.enable = true
           leader.imbalance.check.interval.seconds = 300
           log.retention.check.interval.ms = 300000
           log.cleaner.delete.retention.ms = 604800000

Thanks and regards

You are right: you are experiencing this issue because of the different default values for log.retention.* and offsets.retention.* (7 days and 1 day respectively) in Kafka versions prior to 2.0; see the description in KIP-186 linked above. It happens because messages arrive in your topic only rarely, so the committed offset data has already expired by the time the consumer restarts.

Your phrase "Obviously we can't set consumer to "latest"" is not totally correct. If the last messages were received less than 1 day ago (eg only a few hours ago), you could safely update the auto.offset.reset value to latest while keeping the same group.id (or application.id). In that case you will not lose messages.
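
For reference, a minimal sketch of that consumer-side change with Spring Cloud Stream, assuming a binding named input (the binding name is a placeholder):

           spring.cloud.stream.bindings.input.group=consumer_group1
           # Kafka binder consumer property: where to start when no committed offset
           # exists for the group (maps to the consumer's auto.offset.reset)
           spring.cloud.stream.kafka.bindings.input.consumer.startOffset=latest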

As another option, you could change the log retention value for a specific topic to 1 day. You could also increase the offsets.retention.* value, but with that you need to test it from a performance point of view, as performance might degrade.
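
As an illustration of those two options, a hedged sketch; the topic name my-topic and the retention values are placeholders, and the ZooKeeper-based kafka-configs form is the one available on brokers of this era:

           # Lower the retention of a single topic to 1 day (86400000 ms)
           bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
               --entity-type topics --entity-name my-topic \
               --add-config retention.ms=86400000

           # Or, in server.properties, keep committed offsets at least as long as the log
           # (placeholder value: 14 days)
           offsets.retention.minutes=20160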

If you keep your application running 24x7 (eg over the weekend when there is no data), one option would be to configure an idle event interval and add an ApplicationListener (or @EventListener ) to listen for ListenerContainerIdleEvents.

Then, if the idleTime property of the event is approaching your log retention, you can re-commit the offsets using the Consumer available in the event: get the assigned partitions, find their current position(), and commit those offsets again.
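
A minimal sketch of that idea with Spring for Apache Kafka; the class name and the 12-hour threshold are placeholders, not part of the original answer:

           import java.util.HashMap;
           import java.util.Map;

           import org.apache.kafka.clients.consumer.Consumer;
           import org.apache.kafka.clients.consumer.OffsetAndMetadata;
           import org.apache.kafka.common.TopicPartition;
           import org.springframework.context.event.EventListener;
           import org.springframework.kafka.event.ListenerContainerIdleEvent;
           import org.springframework.stereotype.Component;

           @Component
           public class IdleOffsetRecommitter {

               // Placeholder threshold: re-commit once the container has been idle for 12 hours.
               private static final long RECOMMIT_AFTER_MS = 12L * 60 * 60 * 1000;

               @EventListener
               public void onIdle(ListenerContainerIdleEvent event) {
                   if (event.getIdleTime() < RECOMMIT_AFTER_MS) {
                       return;
                   }
                   // The idle event is published on the consumer thread, so the Consumer
                   // taken from the event can be used safely here.
                   Consumer<?, ?> consumer = event.getConsumer();
                   Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                   for (TopicPartition tp : event.getTopicPartitions()) {
                       // position() is the offset of the next record that will be fetched.
                       offsets.put(tp, new OffsetAndMetadata(consumer.position(tp)));
                   }
                   if (!offsets.isEmpty()) {
                       // Re-committing refreshes the commit timestamp, so the broker's
                       // offsets retention window starts counting again.
                       consumer.commitSync(offsets);
                   }
               }
           }

Note that idle events are only published if the container's idle event interval is set (for example via ContainerProperties.setIdleEventInterval).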
