
How to ensure that in one Kafka topic the same key goes to the same partition for multiple tables

I have a requirement to produce data from multiple MongoDB tables and push it to the same Kafka topic using the mongo-kafka connector. I also have to ensure that data for the same table key column value always goes to the same partition, to preserve message ordering. For example:

tables --> customer, address

table key columns --> CustomerID (for table customer), AddressID (for table address)

For CustomerID = 12345, it will always go to partition 1

For AddressID = 54321, it will always go to partition 2

For a single table, the second requirement is easy to achieve using chained transformations. However, for multiple tables going to one topic, I am finding it difficult to achieve, since each of these tables has a different key column name.

Is there any way to fulfil both requirements using the Kafka connector?

If you use the ExtractField$Key transform and the IntegerConverter, all matching IDs should go to the same partition.
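As a minimal sketch of that connector configuration, assuming one source connector per table so the key field name is fixed (the transform alias extractKey and the field CustomerID are placeholders, and the key is assumed to already contain that field):

    transforms=extractKey
    transforms.extractKey.type=org.apache.kafka.connect.transforms.ExtractField$Key
    transforms.extractKey.field=CustomerID
    key.converter=org.apache.kafka.connect.converters.IntegerConverter

With the key reduced to a bare integer and serialized by IntegerConverter, every record with the same ID produces identical key bytes and is therefore hashed to the same partition.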

If you have two key columns for one table, or end up with keys like {"CustomerID": 12345}, then you have a composite/object key, meaning the whole key will be hashed to compute the partition, not the ID by itself.
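To illustrate why that matters, here is a small Java sketch (the class name and partition count are made up) of how the default partitioner derives a partition from the serialized key bytes; an integer key and a JSON object key carrying the same ID produce different bytes and therefore, in general, different partitions:

    import org.apache.kafka.common.utils.Utils;

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class KeyHashingDemo {

        // Roughly what the default partitioner does: murmur2-hash the serialized
        // key bytes and take the result modulo the partition count.
        static int partitionFor(byte[] keyBytes, int numPartitions) {
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }

        public static void main(String[] args) {
            int numPartitions = 6; // assumed partition count for the demo

            // CustomerID 12345 serialized as a plain 4-byte integer (what IntegerConverter produces).
            byte[] intKey = ByteBuffer.allocate(4).putInt(12345).array();

            // The same logical ID wrapped in a JSON object key (a composite/object key).
            byte[] jsonKey = "{\"CustomerID\": 12345}".getBytes(StandardCharsets.UTF_8);

            System.out.println("integer key partition: " + partitionFor(intKey, numPartitions));
            System.out.println("object key partition:  " + partitionFor(jsonKey, numPartitions));
            // Different bytes are hashed, so the two keys generally land on different partitions.
        }
    }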

You cannot set the partition based on specific fields of a record without setting producer.override.partitioner.class in the connector config. In other words, you need to implement a partitioner that deserializes your data, parses the values, then computes and returns the respective partition.
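A rough sketch of such a partitioner, assuming the record key arrives as a JSON string like {"CustomerID": 12345} or {"AddressID": 54321} (the class name TableKeyPartitioner and the key format are assumptions, not something the connector guarantees):

    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.utils.Utils;

    import java.nio.charset.StandardCharsets;
    import java.util.Map;

    // Hypothetical example: extracts the first run of digits from the key
    // (e.g. {"CustomerID": 12345} or {"AddressID": 54321}) and partitions on it,
    // so the same ID always lands on the same partition regardless of the table.
    public class TableKeyPartitioner implements Partitioner {

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            if (keyBytes == null) {
                return 0; // keyless records: send them all to a fixed partition
            }
            String keyString = new String(keyBytes, StandardCharsets.UTF_8);
            // Naive parsing for the sketch; a real implementation would deserialize
            // the key properly (e.g. with a JSON library) and pick the ID field.
            String digits = keyString.replaceAll("\\D+", "");
            if (digits.isEmpty()) {
                // No numeric ID found: fall back to hashing the raw key bytes.
                return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
            }
            return (int) (Long.parseLong(digits) % numPartitions);
        }

        @Override
        public void configure(Map<String, ?> configs) {
        }

        @Override
        public void close() {
        }
    }

It would then be enabled in the connector config with producer.override.partitioner.class=com.example.TableKeyPartitioner (package name assumed), which typically also requires connector.client.config.override.policy=All on the Connect worker.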
