
How to commit offsets manually in a Kafka Sink Connector

I have a Kafka Sink Task that receives records from a Kafka topic through its put() method.
I do not want the offsets to be committed automatically, because I run some processing logic on each record after it is fetched from Kafka.
I want to commit the offset only if the processing succeeds; otherwise the task should read from the same offset again.

I can see that the Kafka consumer has a commitSync() method, but I cannot find an equivalent in the Sink Connector API.

Sink Kafka Connector-Commit

Even when the consumer option enable.auto.commit is false, Kafka Connect itself still commits offsets automatically, by default every 60 seconds, according to the offset.flush.interval.ms option below. If your put() method does not throw an error, the offsets are committed normally.

offset.flush.interval.ms
Interval at which to try committing offsets for tasks.

Type: long
Default: 60000
Importance: low

To manage offsets in Sink Kafka

By default, Kafka Connect commits all the offsets it passes to the task via preCommit. If your preCommit returns an empty map of offsets, Kafka Connect records no offsets at all.

SinkTask.java

/**
 * Pre-commit hook invoked prior to an offset commit.
 *
 * The default implementation simply invokes {@link #flush(Map)} and is thus able to assume all {@code currentOffsets} are committable.
 *
 * @param currentOffsets the current offset state as of the last call to {@link #put(Collection)}},
 *                       provided for convenience but could also be determined by tracking all offsets included in the {@link SinkRecord}s
 *                       passed to {@link #put}.
 *
 * @return an empty map if Connect-managed offset commits are not desired, otherwise a map of committable offsets by topic-partition.
 */
public Map<TopicPartition, OffsetAndMetadata> preCommit(Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
    flush(currentOffsets);
    return currentOffsets;
}
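
The offset-selection logic behind a custom preCommit can be sketched in plain Java. This is only a minimal sketch, not Kafka Connect code: ProcessedOffsetTracker, markProcessed, and committable are hypothetical names, and a "topic-partition" string stands in for TopicPartition. It assumes records of a partition are processed in order and that processing stops at the first failure. It also relies on the Kafka convention that a committed offset is the offset of the next record to read, that is, the last processed offset plus one.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: tracks, per partition, the next offset that is safe to commit. */
class ProcessedOffsetTracker {
    // Key: "topic-partition" string, a stand-in for Kafka's TopicPartition.
    private final Map<String, Long> nextOffsets = new HashMap<>();

    /** Call only after a record has been processed successfully. */
    void markProcessed(String topic, int partition, long offset) {
        // Kafka commits the offset of the NEXT record to read, hence offset + 1.
        // Assumes in-order processing that stops at the first failed record.
        nextOffsets.merge(topic + "-" + partition, offset + 1, Math::max);
    }

    /** Snapshot of committable offsets; empty if nothing has succeeded yet. */
    Map<String, Long> committable() {
        return new HashMap<>(nextOffsets);
    }
}
```

In a real task, an overridden preCommit would return the equivalent of committable() as a Map<TopicPartition, OffsetAndMetadata> instead of currentOffsets, so that offsets of records that were never processed successfully are not committed.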

or

SinkTaskContext.java

/**
 * Request an offset commit. Sink tasks can use this to minimize the potential for redelivery
 * by requesting an offset commit as soon as they flush data to the destination system.
 *
 * This is a hint to the runtime and no timing guarantee should be assumed.
 */
void requestCommit();
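
The pattern described in that Javadoc, asking for a commit right after data has been flushed to the destination, might look roughly like the sketch below. It is written in plain Java so it can run on its own: CommitHint is a stand-in for the single SinkTaskContext method used here, and BufferingWriter, its batch size, and the in-memory destination list are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the one SinkTaskContext method used in this sketch. */
interface CommitHint {
    void requestCommit();
}

/** Hypothetical writer: buffers records and requests a commit after each successful flush. */
class BufferingWriter {
    private final CommitHint context;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    // Stand-in for the external destination system.
    private final List<String> destination = new ArrayList<>();

    BufferingWriter(CommitHint context, int batchSize) {
        this.context = context;
        this.batchSize = batchSize;
    }

    /** Buffers one record and flushes once the batch is full. */
    void put(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Writes the buffered records, then hints that offsets can be committed. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        destination.addAll(buffer); // a real write could throw here and skip the hint
        buffer.clear();
        context.requestCommit();    // only reached when the write succeeded
    }

    int written() {
        return destination.size();
    }
}
```

As the Javadoc notes, requestCommit() is only a hint to the runtime, so a task should not assume the commit happens immediately after the call.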

Add this property: ("enable.auto.commit", "false")

The consumer property enable.auto.commit has a default value of true, and a second property, auto.commit.interval.ms, has a default value of 5000 (milliseconds).
