
How to use Single Message Transforms with Kafka Connect JDBC Source Connector and multiple tables?

I want to set the message key when importing tables with the Kafka Connect JDBC Source Connector.

How can Single Message Transforms (SMTs) in a Kafka Connect source connector be targeted at the right fields when the JDBC connector is configured to read from multiple tables? An SMT needs a column name, and that name may differ from table to table.

I don't see a way to filter SMT definitions based on table name or similar. The code sample below works fine because it only covers one table.

But what should I do if I have different tables, e.g. User, Order, Product?

"table.whitelist": "User",
"transforms": "createKey,extract",
"transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.createKey.fields": "user_id",
"transforms.extract.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extract.field": "user_id",

When a worker task with that configuration encounters a table without the user_id field, it crashes and remains in status FAILED:

org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:50)
at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:293)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:229)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:85)
at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)
at org.apache.kafka.connect.runtime.TransformationChain.lambda$apply$0(TransformationChain.java:50)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 11 more

This is plausible, since there is no way to define transforms per table or target topic, or is there? I would expect a capability to restrict transforms to a given table or topic, e.g. something like

transforms.<topic-name>.createKey.type

Am I missing something, or is this a Connect restriction?

It is not possible to apply SMTs only to specific topics, because transforms are a connector-level configuration: they are applied to every message the connector processes.
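For readers on newer workers: Apache Kafka 2.6 and later add predicates (KIP-585), which let a transform run only for records that match a condition such as the topic name, so this restriction is relaxed there. The fragment below is only a sketch of how that could look. The topic prefix shop-, the Order table, its order_id column, and all transform and predicate aliases are assumptions for illustration and do not come from the question.

```json
{
  "table.whitelist": "User,Order",
  "topic.prefix": "shop-",

  "transforms": "createUserKey,extractUserKey,createOrderKey,extractOrderKey",

  "transforms.createUserKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
  "transforms.createUserKey.fields": "user_id",
  "transforms.createUserKey.predicate": "isUser",
  "transforms.extractUserKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
  "transforms.extractUserKey.field": "user_id",
  "transforms.extractUserKey.predicate": "isUser",

  "transforms.createOrderKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
  "transforms.createOrderKey.fields": "order_id",
  "transforms.createOrderKey.predicate": "isOrder",
  "transforms.extractOrderKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
  "transforms.extractOrderKey.field": "order_id",
  "transforms.extractOrderKey.predicate": "isOrder",

  "predicates": "isUser,isOrder",
  "predicates.isUser.type": "org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
  "predicates.isUser.pattern": "shop-User",
  "predicates.isOrder.type": "org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
  "predicates.isOrder.pattern": "shop-Order"
}
```

The JDBC source connector names each topic as topic.prefix followed by the table name, and TopicNameMatches treats pattern as a Java regular expression that must match the whole topic name, so each pattern here selects exactly one table's topic. On workers older than 2.6 the predicates properties are not available.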

I would recommend creating a distinct connector for each topic, so that each set of SMTs is applied only to the topic whose fields it expects.
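As a rough sketch of that recommendation, the two connector definitions below each read a single table and key its records by that table's own column. They are written in the shape accepted by the Connect REST API (POST /connectors), and each would be submitted separately. The connector names, the connection.url, the topic.prefix, the bulk mode, and the Order table with its order_id column are assumptions for illustration; only the User table and user_id come from the question.

```json
{
  "name": "jdbc-source-user",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://localhost:5432/shop",
    "mode": "bulk",
    "topic.prefix": "shop-",
    "table.whitelist": "User",
    "transforms": "createKey,extract",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.createKey.fields": "user_id",
    "transforms.extract.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extract.field": "user_id"
  }
}
```

```json
{
  "name": "jdbc-source-order",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://localhost:5432/shop",
    "mode": "bulk",
    "topic.prefix": "shop-",
    "table.whitelist": "Order",
    "transforms": "createKey,extract",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.createKey.fields": "order_id",
    "transforms.extract.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extract.field": "order_id"
  }
}
```

Because each connector sees only one table, ValueToKey never encounters a record that lacks its configured field, which is the condition behind the NullPointerException in the stack trace above. The trade-off is one connector (and at least one task and database connection) per table.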


 