
Kafka only subscribe to latest message?

Sometimes (it seems very random) Kafka sends old messages. I only want the latest message for each key, so newer messages should overwrite older ones with the same key. Currently it looks like I have multiple messages with the same key that don't get compacted.

I use this setting on the topic:

cleanup.policy=compact
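(For reference, a minimal sketch of how a topic with this policy can be created programmatically with the AdminClient API that ships with the same client version; the topic name, partition count, and bootstrap address are placeholders:)

import org.apache.kafka.clients.admin.AdminClient
import org.apache.kafka.clients.admin.AdminClientConfig
import org.apache.kafka.clients.admin.NewTopic
import java.util.Properties

fun createCompactedTopic() {
    val props = Properties().apply {
        put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // placeholder
    }
    AdminClient.create(props).use { admin ->
        // One partition, replication factor 1, and log compaction enabled.
        val topic = NewTopic("my-topic", 1, 1.toShort())                  // placeholder topic
                .configs(mapOf("cleanup.policy" to "compact"))
        admin.createTopics(listOf(topic)).all().get()
    }
}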

I'm using Java/Kotlin and the Apache Kafka 1.1.1 client.

import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import java.util.Properties

Properties(8).apply {
    val jaasTemplate = "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"%s\" password=\"%s\";"
    val jaasCfg = String.format(jaasTemplate, Configuration.kafkaUsername, Configuration.kafkaPassword)
    put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS)
    put(ConsumerConfig.GROUP_ID_CONFIG, "ApiKafkaKotlinConsumer${Configuration.kafkaGroupId}")
    put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer::class.java.name)
    put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer::class.java.name)

    put("security.protocol", "SASL_SSL")
    put("sasl.mechanism", "SCRAM-SHA-256")
    put("sasl.jaas.config", jaasCfg)
    put("max.poll.records", 100)
    put("receive.buffer.bytes", 1000000)
}

Have I missed some settings?

If you want to have only one value for each key, you have to use the KTable<K,V> abstraction: StreamsBuilder::table(final String topic) from Kafka Streams. The topic used here should have its cleanup policy set to compact.
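A minimal sketch of that approach (the application id, bootstrap address, and topic name are placeholders, and string serdes are assumed):

import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.KafkaStreams
import org.apache.kafka.streams.StreamsBuilder
import org.apache.kafka.streams.StreamsConfig
import java.util.Properties

fun main() {
    val props = Properties().apply {
        put(StreamsConfig.APPLICATION_ID_CONFIG, "latest-per-key-app")   // placeholder
        put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")    // placeholder
        put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().javaClass)
        put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().javaClass)
    }

    val builder = StreamsBuilder()
    // table() reads the topic as a changelog: each key is materialized with its latest value only.
    val table = builder.table<String, String>("my-topic")                // placeholder topic
    // Every downstream update reflects the current value for its key.
    table.toStream().foreach { key, value -> println("$key -> $value") }

    KafkaStreams(builder.build(), props).start()
}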

If you use KafkaConsumer, you just pull data from the brokers. It doesn't give you any mechanism that performs any kind of deduplication. Depending on whether compaction was performed or not, you can get one to n messages for the same key.
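So with the plain consumer, deduplication has to happen client-side. A minimal sketch, assuming string keys and values and reusing the properties from the question (the topic name and poll timeout are placeholders):

import org.apache.kafka.clients.consumer.KafkaConsumer
import java.util.Properties

fun latestPerKey(consumerProps: Properties, topic: String): Map<String, String> {
    val latest = mutableMapOf<String, String>()
    KafkaConsumer<String, String>(consumerProps).use { consumer ->
        consumer.subscribe(listOf(topic))
        // Single bounded poll for illustration; a real consumer would poll in a loop.
        // poll(long) is the 1.1.1 API; clients from 2.0 on take a Duration instead.
        val records = consumer.poll(5000L)
        for (record in records) {
            // Later records overwrite earlier ones, so only the newest value per key survives.
            latest[record.key()] = record.value()
        }
    }
    return latest
}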

Regarding compaction

Compaction doesn't mean that all old values for the same key are removed immediately. When an old message for a given key is removed depends on several properties. The most important are listed below (a sketch of the topic-level counterparts follows the list):

  • log.cleaner.min.cleanable.ratio

The minimum ratio of dirty log to total log for a log to be eligible for cleaning.

  • log.cleaner.min.compaction.lag.ms

The minimum time a message will remain uncompacted in the log. Only applicable for logs that are being compacted.

  • log.cleaner.enable

Enable the log cleaner process to run on the server. Should be enabled if using any topics with cleanup.policy=compact, including the internal offsets topic. If disabled, those topics will not be compacted and will continually grow in size.
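The first two properties above also have topic-level counterparts (min.cleanable.dirty.ratio and min.compaction.lag.ms). A minimal sketch of tuning them on an existing topic with the AdminClient, assuming a placeholder topic name and bootstrap address; note that alterConfigs in this client version replaces the topic's whole config set, so all desired entries must be listed:

import org.apache.kafka.clients.admin.AdminClient
import org.apache.kafka.clients.admin.AdminClientConfig
import org.apache.kafka.clients.admin.Config
import org.apache.kafka.clients.admin.ConfigEntry
import org.apache.kafka.common.config.ConfigResource
import java.util.Properties

fun tuneCompaction() {
    val props = Properties().apply {
        put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")    // placeholder
    }
    AdminClient.create(props).use { admin ->
        val resource = ConfigResource(ConfigResource.Type.TOPIC, "my-topic") // placeholder topic
        // Compact eagerly: clean as soon as 10% of the log is dirty, with no minimum lag.
        val config = Config(listOf(
                ConfigEntry("cleanup.policy", "compact"),
                ConfigEntry("min.cleanable.dirty.ratio", "0.1"),
                ConfigEntry("min.compaction.lag.ms", "0")
        ))
        admin.alterConfigs(mapOf(resource to config)).all().get()
    }
}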

You can find more detail about compaction at https://kafka.apache.org/documentation/#compaction
