
Kafka - uncompacted topics vs compacted topics

I came across the following two passages in the book "Mastering Kafka Streams and ksqlDB". The author uses two terms, "compacted topics" and "uncompacted topics". What do they really mean?

Do they have anything to do with "log compaction"?

Tables can be thought of as updates to a database. In this view of the logs, only the current state (either the latest record for a given key or some kind of aggregation) for each key is retained. Tables are usually built from compacted topics.

Streams can be thought of as inserts in database parlance. Each distinct record remains in this view of the log. Streams are usually built from uncompacted topics.
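To make the two views in the quoted passages concrete, here is a minimal sketch in plain Python (no Kafka client involved). The `Record` alias and the `stream_view` and `table_view` helpers are hypothetical names chosen for illustration, and a record is simplified to a `(key, value)` pair.

```python
from typing import Dict, List, Tuple

# Simplified stand-in for a Kafka record: (key, value). This is an assumption
# for illustration; real records also carry offsets, timestamps and headers.
Record = Tuple[str, str]


def stream_view(log: List[Record]) -> List[Record]:
    """Stream semantics: every record is an independent insert, so keep them all."""
    return list(log)


def table_view(log: List[Record]) -> Dict[str, str]:
    """Table semantics: each record is an update, so later values overwrite earlier ones."""
    table: Dict[str, str] = {}
    for key, value in log:
        table[key] = value
    return table


log = [("alice", "online"), ("bob", "online"), ("alice", "offline")]

print(stream_view(log))  # all three records, in order
print(table_view(log))   # one entry per key, holding the latest value
```

The same log feeds both views. The stream keeps both records for "alice", while the table keeps only her latest value, which is why a compacted topic is a natural backing store for a table.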

Yes, they refer to log compaction. According to the Kafka docs:

Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition.

https://kafka.apache.org/documentation/#compaction

If log compaction is enabled on a topic, Kafka eventually removes older records once a newer record with the same key exists in the same partition log.
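As a rough illustration of the end result, the following plain-Python sketch simulates compaction of a single partition. The `Record` type and the `compact` function are hypothetical names, not part of any Kafka API. The sketch assumes the log is already in offset order, and it ignores the fact that the real cleaner runs in the background, so on a real broker older duplicates can remain visible for some time. On a real topic, compaction is enabled with the topic-level setting `cleanup.policy=compact`.

```python
from typing import Dict, List, NamedTuple


class Record(NamedTuple):
    """Simplified record for one partition (illustrative, not Kafka's format)."""
    offset: int
    key: str
    value: str


def compact(log: List[Record]) -> List[Record]:
    """Keep only the highest-offset record for each key, preserving offset order."""
    latest_offset: Dict[str, int] = {}
    for record in log:
        # Assumes records arrive in increasing offset order.
        latest_offset[record.key] = record.offset
    return [r for r in log if latest_offset[r.key] == r.offset]


log = [
    Record(0, "k1", "a"),
    Record(1, "k2", "b"),
    Record(2, "k1", "c"),
    Record(3, "k3", "d"),
    Record(4, "k2", "e"),
]

for record in compact(log):
    print(record)
```

The surviving records keep their original offsets (2, 3 and 4 here). Compaction removes superseded records but does not renumber the ones that remain.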

For a more detailed explanation of log compaction, see https://medium.com/swlh/introduction-to-topic-log-compaction-in-apache-kafka-3e4d4afd2262

Yes, these terms are synonymous: a compacted topic is a topic that has log compaction enabled.

Ref: Log Compaction

From this article:

The idea behind compacted topics is that no duplicate keys exist. Only the most recent value for a message key is maintained.

It is mostly used for scenarios such as restoring the previous state after an application crash or system failure, or reloading a cache after an application restart.

As an example of the above, Kafka has the topic __consumer_offsets, which can be used to continue, after a crash or a restart, from the last message that was read. A schema registry is also often used to ensure compatible communication between producers and consumers. The schemas used are maintained in the __schemas topic.
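The resume-after-restart idea can be sketched in plain Python as follows. This is a simplified model and not the actual format of __consumer_offsets: the `OffsetKey` and `Commit` aliases, the `resume_position` function, and the group and topic names are all hypothetical. It assumes each commit is keyed by consumer group, topic and partition, so that only the latest commit per key matters.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative assumption: a commit is keyed by (consumer group, topic, partition).
OffsetKey = Tuple[str, str, int]
Commit = Tuple[OffsetKey, int]  # (key, committed offset)


def resume_position(commits: List[Commit], key: OffsetKey) -> Optional[int]:
    """Replay the commit log and return the last committed offset for key, if any."""
    latest: Dict[OffsetKey, int] = {}
    for commit_key, offset in commits:
        latest[commit_key] = offset  # a newer commit supersedes an older one
    return latest.get(key)


commits = [
    (("billing", "orders", 0), 10),
    (("billing", "orders", 1), 7),
    (("billing", "orders", 0), 25),
]

print(resume_position(commits, ("billing", "orders", 0)))  # latest commit wins
print(resume_position(commits, ("audit", "orders", 0)))    # no commit recorded
```

Because only the newest commit per key is needed to resume, older commits can be compacted away without losing any information the consumer relies on.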
