
Data Modeling with Kafka? Topics and Partitions

One of the first things I think about when using a new service (such as a non-RDBMS data store or a message queue) is: "How should I structure my data?".

I've read and watched some introductory materials. In particular, take, for example, Kafka: a Distributed Messaging System for Log Processing, which writes:

  • "a Topic is the container with which messages are associated"
  • "the smallest unit of parallelism is the partition of a topic. This implies that all messages that ... belong to a particular partition of a topic will be consumed by a consumer in a consumer group."

Knowing this, what would be a good example that illustrates how to use topics and partitions? When should something be a topic? When should something be a partition?

As an example, let's say my (Clojure) data looks like:

{:user-id 101 :viewed "/page1.html" :at #inst "2013-04-12T23:20:50.22Z"}
{:user-id 102 :viewed "/page2.html" :at #inst "2013-04-12T23:20:55.50Z"}

Should the topic be based on user-id? viewed? at? What about the partition?

How do I decide?

When structuring your data for Kafka, it really depends on how it is meant to be consumed.

In my mind, a topic is a grouping of messages of a similar type that will be consumed by the same type of consumer. In the example above, I would have just a single topic, and if you decide to push some other kind of data through Kafka, you can add a new topic for that later.

Topics are registered in ZooKeeper, which means that you might run into issues if you try to add too many of them, e.g. the case where you have a million users and have decided to create a topic per user.

Partitions, on the other hand, are a way to parallelize the consumption of messages. The total number of partitions in a broker cluster needs to be at least the same as the number of consumers in a consumer group for the partitioning feature to make sense. Consumers in a consumer group split the burden of processing the topic between themselves according to the partitioning, so that each consumer is only concerned with messages in the partitions it is "assigned to".
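The split described above can be sketched in a few lines. This is not a real Kafka client; the round-robin strategy and the consumer names are illustrative assumptions (Kafka's actual assignment strategy is configurable):

```python
# Minimal sketch of how a topic's partitions are divided among the
# consumers of one consumer group (illustrative, not a Kafka client).

def assign_partitions(partitions, consumers):
    """Spread partition ids across consumer names, round-robin."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 6 partitions, 3 consumers: each consumer ends up owning 2 partitions,
# and every partition is consumed by exactly one member of the group.
plan = assign_partitions(list(range(6)), ["c0", "c1", "c2"])
print(plan)  # {'c0': [0, 3], 'c1': [1, 4], 'c2': [2, 5]}
```

Note that with more consumers than partitions, some consumers would sit idle, which is why the partition count bounds the useful parallelism.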

Partitioning can either be set explicitly using a partition key on the producer side, or, if no key is provided, a partition will be selected at random for every message.
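The producer-side choice can be sketched as follows. This is not Kafka's actual default partitioner (which hashes the key bytes with murmur2); CRC32 stands in here just to show the keyed-vs-unkeyed behavior:

```python
# Minimal sketch of producer-side partition selection (illustrative;
# Kafka's real default partitioner uses murmur2 on the key bytes).
import random
import zlib

def choose_partition(key, num_partitions):
    if key is None:
        # No key: any partition may be chosen.
        return random.randrange(num_partitions)
    # With a key: a deterministic hash, so the same key always
    # lands on the same partition.
    return zlib.crc32(key.encode()) % num_partitions

p1 = choose_partition("user-101", 4)
p2 = choose_partition("user-101", 4)
print(p1 == p2)  # True
```

This determinism is what makes key choice a data-modeling decision: all messages sharing a key share a partition, and therefore share ordering and a consumer.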

Once you know how to partition your event stream, the topic name will be easy, so let's answer that question first.

@Ludd is correct - the partition structure you choose will depend largely on how you want to process the event stream. Ideally you want a partition key which means that your event processing is partition-local.

For example:

  1. If you care about users' average time-on-site, then you should partition by :user-id. That way, all the events related to a single user's site activity will be available within the same partition. This means that a stream processing engine such as Apache Samza can calculate average time-on-site for a given user just by looking at the events in a single partition. This avoids having to perform any kind of costly partition-global processing.
  2. If you care about the most popular pages on your website, you should partition by the :viewed page. Again, Samza will be able to keep a count of a given page's views just by looking at the events in a single partition.
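The second case can be sketched to show why partition-local state is enough. The event shapes mirror the question's data; the two-partition setup and CRC32 hashing are illustrative assumptions:

```python
# Minimal sketch: when events are partitioned by the :viewed page,
# every view of a given page lands in one partition, so a per-page
# count needs only partition-local state (illustrative, not Samza).
import zlib
from collections import Counter

def partition_of(key, n):
    return zlib.crc32(key.encode()) % n

events = [
    {"user-id": 101, "viewed": "/page1.html"},
    {"user-id": 102, "viewed": "/page2.html"},
    {"user-id": 103, "viewed": "/page1.html"},
]

n = 2
partitions = [[] for _ in range(n)]
for e in events:
    partitions[partition_of(e["viewed"], n)].append(e)

# Each partition can compute complete view counts for "its" pages
# without consulting any other partition or a remote database.
for part in partitions:
    print(dict(Counter(e["viewed"] for e in part)))
```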

Generally, we are trying to avoid having to rely on global state (such as keeping counts in a remote database like DynamoDB or Cassandra), and instead be able to work using partition-local state. This is because local state is a fundamental primitive in stream processing.

If you need both of the above use-cases, then a common pattern with Kafka is to first partition by, say, :user-id, and then to re-partition by :viewed ready for the next phase of processing.

On topic names - an obvious one here would be events or user-events. To be more specific you could go with events-by-user-id and/or events-by-viewed.

This is not exactly related to the question, but in case you have already decided upon the logical segregation of records based on topics and want to optimize the topic/partition count in Kafka, this blog post might come in handy.

Key takeaways in a nutshell:

  • In general, the more partitions there are in a Kafka cluster, the higher the throughput one can achieve. Let the max throughput achievable on a single partition for production be p and for consumption be c. Let's say your target throughput is t. Then you need to have at least max(t/p, t/c) partitions.
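A worked example of that sizing rule, with purely illustrative numbers:

```python
# Sizing rule from above: partitions >= max(t/p, t/c).
# All throughput figures here are made-up examples.
import math

p = 10   # MB/s one partition can absorb from producers
c = 20   # MB/s one consumer can read from one partition
t = 100  # MB/s target throughput for the topic

partitions_needed = math.ceil(max(t / p, t / c))
print(partitions_needed)  # 10 (production is the bottleneck here)
```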

  • Currently, in Kafka, each broker opens a file handle for both the index and the data file of every log segment. So, the more partitions, the higher one needs to configure the open file handle limit in the underlying operating system. E.g. in our production system, we once saw an error saying too many files are open, while we had around 3600 topic partitions.

  • When a broker is shut down uncleanly (e.g., kill -9), the observed unavailability could be proportional to the number of partitions.

  • The end-to-end latency in Kafka is defined by the time from when a message is published by the producer to when the message is read by the consumer. As a rule of thumb, if you care about latency, it's probably a good idea to limit the number of partitions per broker to 100 x b x r, where b is the number of brokers in a Kafka cluster and r is the replication factor.
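A worked example of that latency rule of thumb, again with illustrative numbers:

```python
# Rule of thumb from above: cap partitions per broker at 100 * b * r.
# Cluster size and replication factor are made-up examples.
b = 5  # brokers in the cluster
r = 3  # replication factor

max_partitions_per_broker = 100 * b * r
print(max_partitions_per_broker)  # 1500
```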

I think a topic name summarizes a kind of message: producers publish messages to a topic, and consumers receive those messages by subscribing to the topic.

A topic can have many partitions. Partitions are good for parallelism. A partition is also the unit of replication, so in Kafka, leader and follower are defined at the partition level. In fact, a partition is an ordered queue, where the order is the order in which messages arrived; put simply, a topic is composed of one or more such queues. This is useful when modeling our data structure.

Kafka was developed by LinkedIn for log aggregation and delivery, and that scenario works very well as an example.

The user's events on your web site or app can be logged by your web server and then sent to a Kafka broker through the producer. In the producer, you can specify the partition method, for example: by event type (different events are saved in different partitions), by event time (partitioning a day into different periods according to your application logic), by user type, or with no logic at all, balancing all logs across many partitions.

For the case in question, you could create one topic called "page-view-event" and create N partitions, using hash keys to distribute the logs evenly across all partitions. Or you could choose whatever partitioning logic best fits how you intend to process the logs.

Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license; if you need to republish, please credit this site's URL or the original source. For any questions contact: yoyou2525@163.com.
