
Designing Kafka Topics - Many Topics vs One Big Topic

For a stream of different events, the recommended way would be one of:

  • one big topic containing all events
  • multiple topics, one per type of event

Which option would be better?

I understand that when messages are not in the same partition of a topic there is no ordering guarantee between them, but are there any other factors to consider when making this decision?

A topic is a logical abstraction and should contain messages of the same type. Say you monitor a website and capture click stream events, and you also have a database that publishes its changes into a changelog topic. You should have two different topics, because click stream events are not related to your database changelog.

This has multiple advantages:

  • your data will have different formats, and you will need different (de)serializers to write and read it (with a single topic you would need a hybrid serializer, and you would not get type safety when reading data)
  • you will have different consumer applications: one might be interested in click stream events only, a second only in the database changelog, and a third in both. With multiple topics, applications one and two subscribe only to the topics they are interested in. With a single topic, applications one and two need to read everything and filter out what they are not interested in, which increases broker, network, and client load
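To make the first point concrete, here is a minimal sketch of per-topic deserializers next to a hybrid one. Everything in it is an assumption for illustration: the `ClickEvent` and `RowChange` classes, the JSON encoding, and the `"type"` discriminator field are not from any Kafka API.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ClickEvent:
    user_id: str
    url: str


@dataclass(frozen=True)
class RowChange:
    table: str
    operation: str


def deserialize_click(raw: bytes) -> ClickEvent:
    """Deserializer for a dedicated click stream topic: always one type."""
    return ClickEvent(**json.loads(raw))


def deserialize_row_change(raw: bytes) -> RowChange:
    """Deserializer for a dedicated changelog topic: always one type."""
    return RowChange(**json.loads(raw))


def deserialize_hybrid(raw: bytes) -> Union[ClickEvent, RowChange]:
    """Deserializer for one shared topic.

    It needs an extra discriminator field ("type", an assumption of this
    sketch), and every caller must check which type came back.
    """
    payload = json.loads(raw)
    kind = payload.pop("type")
    if kind == "click":
        return ClickEvent(**payload)
    if kind == "row_change":
        return RowChange(**payload)
    raise ValueError(f"unknown event type: {kind!r}")


click = deserialize_click(b'{"user_id": "u1", "url": "/home"}')
mixed = deserialize_hybrid(
    b'{"type": "row_change", "table": "orders", "operation": "UPDATE"}'
)
```

With dedicated topics the return type is known up front. With the hybrid version the caller gets a union and has to branch on it, which is the loss of type safety mentioned above.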

As @Matthias J. Sax said before, there is no silver bullet here. But we have to take several aspects into account.

The deciding condition: ordered delivery

If your application needs guaranteed ordered delivery, you need to work with a single topic and use the same key for the messages whose order must be preserved. Kafka only guarantees order within a partition, and with the default partitioner messages with the same key go to the same partition.
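A small sketch of why the key matters, under stated assumptions: `choose_partition` is a hypothetical stand-in for Kafka's default partitioner (which hashes the serialized key with murmur2; CRC32 is used here only to stay within the standard library), and the order events are made up.

```python
import zlib
from collections import defaultdict


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (CRC32 stand-in, not murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Append (key, value) events to per-partition logs in arrival order."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[choose_partition(key, num_partitions)].append((key, value))
    return partitions


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
partitions = route(events, num_partitions=3)

# All events for one key sit in one partition, in the order they were sent.
target = choose_partition("order-1", 3)
order_1 = [value for key, value in partitions[target] if key == "order-1"]
```

Events with different keys may end up in different partitions, and nothing orders them relative to each other. Only the per-key sequence is preserved.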

If ordering is not mandatory, the game starts...

Is the schema the same for all messages?

Would all consumers be interested in the same event types?

What is going to happen on the consumer side? Are we reducing or increasing complexity in terms of implementation, maintainability, error handling...?

Is horizontal scalability important for us? More topics often means more partitions available, which means more capacity to scale horizontally. It also allows finer-grained scaling: on the broker side we can choose how many partitions to add per event type, and on the consumer side how many consumers to run per event type.
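As a rough illustration of per-type sizing, the sketch below spreads partitions over the consumers of one group. It is a simplified stand-in for a Kafka assignor, not the real implementation; the topic names and the partition and consumer counts are hypothetical. The one property it relies on is that within a consumer group each partition is read by a single consumer, so consumers beyond the partition count stay idle.

```python
from typing import Dict, List


def assign_round_robin(num_partitions: int, num_consumers: int) -> Dict[int, List[int]]:
    """Spread the partitions of one topic over the consumers of one group."""
    assignment: Dict[int, List[int]] = {c: [] for c in range(num_consumers)}
    for partition in range(num_partitions):
        assignment[partition % num_consumers].append(partition)
    return assignment


# Hypothetical sizing: the busy topic gets more partitions and more consumers.
topics = {
    "clickstream": {"partitions": 12, "consumers": 6},
    "db-changelog": {"partitions": 2, "consumers": 3},
}

for name, cfg in topics.items():
    assignment = assign_round_robin(cfg["partitions"], cfg["consumers"])
    idle = [c for c, parts in assignment.items() if not parts]
    print(name, assignment, "idle consumers:", idle)
```

With one shared topic there is a single partition count and a single consumer group size to tune for all event types together.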

Does it make sense to parallelise consumption per message type? ...

Technically speaking, if we let consumers choose which types of events to consume, we potentially reduce the network bandwidth spent sending undesired messages from the broker to the consumer, as well as the number of deserialisations for all of them (less CPU used, which over time means more free resources, lower energy cost...).
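A back-of-the-envelope sketch of that saving, with no broker involved. The 90/10 traffic mix, the JSON payloads, and the assumption that the event type is only known after decoding the payload are all hypothetical.

```python
import json
from collections import Counter


def make_payload(event_type: str, n: int) -> bytes:
    return json.dumps({"type": event_type, "n": n}).encode("utf-8")


# Hypothetical traffic mix: 90 click events for every 10 changelog events.
topics = {
    "clickstream": [make_payload("click", i) for i in range(90)],
    "db-changelog": [make_payload("row_change", i) for i in range(10)],
}
single_topic = topics["clickstream"] + topics["db-changelog"]


def consume(payloads, wanted_type: str) -> Counter:
    """Count what a consumer fetches, decodes, and actually uses."""
    stats = Counter()
    for raw in payloads:
        stats["bytes_fetched"] += len(raw)
        event = json.loads(raw)  # decoding is needed just to learn the type
        stats["deserialized"] += 1
        if event["type"] == wanted_type:
            stats["processed"] += 1
    return stats


shared = consume(single_topic, "row_change")
dedicated = consume(topics["db-changelog"], "row_change")
print("single topic:   ", dict(shared))
print("dedicated topic:", dict(dedicated))
```

Both runs process the same 10 changelog events, but the single-topic consumer fetches and decodes all 100 messages to find them.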

It is also worth remembering that splitting different types of messages into different topics doesn't mean we have to consume them with different Kafka consumers, because a consumer can consume from several topics at the same time.
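A minimal sketch of one consumer reading several topics and dispatching by topic. To keep it runnable without a broker, `FakeConsumer` and `Record` are hypothetical in-memory stand-ins. They are modeled on the subscribe-then-iterate shape of Python clients such as kafka-python, but treat the exact client API as an assumption and check your client's documentation.

```python
from collections import namedtuple

# Hypothetical stand-in for a consumed record: just a topic and a value.
Record = namedtuple("Record", ["topic", "value"])


class FakeConsumer:
    """In-memory stand-in for a Kafka consumer (illustration only)."""

    def __init__(self, log):
        self._log = log
        self._topics = set()

    def subscribe(self, topics):
        self._topics = set(topics)

    def __iter__(self):
        # Only records from subscribed topics are ever delivered.
        return (record for record in self._log if record.topic in self._topics)


def handle_click(value):
    return f"click:{value}"


def handle_row_change(value):
    return f"change:{value}"


# One handler per topic; the topic name tells us the message type.
HANDLERS = {"clickstream": handle_click, "db-changelog": handle_row_change}


def run(consumer):
    consumer.subscribe(list(HANDLERS))
    return [HANDLERS[record.topic](record.value) for record in consumer]


log = [
    Record("clickstream", "/home"),
    Record("audit", "login"),
    Record("db-changelog", "orders:UPDATE"),
]
results = run(FakeConsumer(log))
```

The application interested in both event types keeps a single consumer, and the topic name selects the handler instead of a type field inside the payload.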

Well, there's no clear answer to this question, but my feeling is that with Kafka, because of its many features, if ordered delivery is not needed we should split our messages by type into different topics.
