kafka FileStreamSourceConnector write an avro file to topic with key field

I want to use the Kafka FileStreamSourceConnector to write a local Avro file into a topic.

My connector config looks like this:

curl -i -X PUT -H  "Content-Type:application/json" http://localhost:8083/connectors/file_source_connector/config \
            -d '{
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "value.converter.schema.registry.url": "http://schema-registry:8081",
            "topic": "my_topic",
            "file": "/data/log.avsc",
            "format.include.keys": "true",
            "source.auto.offset.reset": "earliest",
            "tasks.max": "1",
            "value.converter.schemas.enable": "true",
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "key.converter": "org.apache.kafka.connect.storage.StringConverter"
          }'

Then when I print out the topic, the key fields are null.

Updated on 2021-03-29:

After watching Robin's video Twelve Days of SMT - Day 2: ValueToKey and ExtractField, I applied SMTs to my connector config:

curl -i -X PUT -H  "Content-Type:application/json" http://localhost:8083/connectors/file_source_connector_02/config \
            -d '{
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "value.converter.schema.registry.url": "http://schema-registry:8081",
            "topic": "my_topic",
            "file": "/data/log.avsc",
            "tasks.max": "1",
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "transforms": "ValueToKey, ExtractField",
            "transforms.ValueToKey.type":"org.apache.kafka.connect.transforms.ValueToKey",
            "transforms.ValueToKey.fields":"id",
            "transforms.ExtractField.type":"org.apache.kafka.connect.transforms.ExtractField$Key",
            "transforms.ExtractField.field":"id"
          }'

However, the connector failed:

Caused by: org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String

I would use the ValueToKey transformer. In the worst case, ignore the values and set a random key.

For details, look at: ValueToKey

FileStreamSource assumes UTF-8 encoded, line-delimited files as input, not binary files such as Avro. Last I checked, format.include.keys is not a valid config for the connector either.

Therefore each consumed event will be a string, and consequently, transforms that require Structs with field names will not work.

You can use the HoistField transform to create a Struct from each "line", but this still will not parse your data to make the ID field accessible for moving to the key.
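For illustration, a minimal sketch of what that HoistField configuration could look like in the connector config above (the alias HoistLine and the field name line are arbitrary names chosen here, not anything from the original question):

            "transforms": "HoistLine",
            "transforms.HoistLine.type": "org.apache.kafka.connect.transforms.HoistField$Value",
            "transforms.HoistLine.field": "line"

After this transform, each value is a Struct containing a single string field holding the raw line, so ValueToKey could only copy that whole line, not an id field inside it.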

Also, your file is AVSC, which is JSON-formatted (a schema), not Avro data, so I'm not sure what the goal is of using the AvroConverter, or of having "schemas.enable": "true". Still, the lines read by the connector are not parsed by converters in a way that makes fields accessible; they are only serialized when sent to Kafka.


My suggestion would be to write some other CLI script using plain producer libraries to parse the file, extract the schema, register it with the Schema Registry, build a producer record for each entity in the file, and send them.
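As a rough illustration of that suggestion, here is a minimal sketch in Python using a recent confluent-kafka client. It assumes the schema lives in /data/log.avsc and that the actual records are available as JSON lines in a hypothetical /data/log.jsonl file containing an id field; the broker and Schema Registry addresses are placeholders. The AvroSerializer registers the schema with the Schema Registry automatically on first use.

import json

from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

TOPIC = "my_topic"

# Load the Avro schema from the .avsc file
with open("/data/log.avsc") as f:
    schema_str = f.read()

sr_client = SchemaRegistryClient({"url": "http://localhost:8081"})
# Registers the schema under "my_topic-value" on first serialization
avro_serializer = AvroSerializer(sr_client, schema_str)

producer = Producer({"bootstrap.servers": "localhost:9092"})

# Assumed data file: one JSON record per line (not part of the original question)
with open("/data/log.jsonl") as f:
    for line in f:
        record = json.loads(line)
        producer.produce(
            TOPIC,
            key=str(record["id"]).encode("utf-8"),  # key taken from the record's id field
            value=avro_serializer(record, SerializationContext(TOPIC, MessageField.VALUE)),
        )

producer.flush()

With this approach you control the message key explicitly, which the FileStreamSourceConnector cannot do for Avro data.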
