
Kafka sink error: “This connector requires that records from Kafka contain the keys for the Cassandra table”

I am trying to sync all tables read from SAP into Cassandra using Kafka. Here is my Cassandra sink config:

{
    "name": "cassandra",
    "config": {
        "connector.class": "io.confluent.connect.cassandra.CassandraSinkConnector",
        "tasks.max": "5",
        "topics" :"sap_table1,sap_table2",
        "cassandra.keyspace": "sap",
        "cassandra.compression":"SNAPPY",
        "cassandra.consistency.level":"LOCAL_QUORUM",
        "cassandra.write.mode":"Update",
        "transforms":"prune", 
       "transforms.prune.type":"org.apache.kafka.connect.transforms.ReplaceField$Value",
        "transforms.prune.whitelist":"CreatedAt,Id,Text,Source,Truncated",
        "transforms.ValueToKey.fields":"ROWTIME"

    }
}

I am getting this error:

Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. (org.apache.kafka.connect.runtime.WorkerSinkTask:584) org.apache.kafka.connect.errors.DataException: Record with a null key was encountered.  This connector requires that records from Kafka contain the keys for the Cassandra table. Please use a transformation like org.apache.kafka.connect.transforms.ValueToKey to create a key with the proper fields.

All tables generated from the Kafka SAP connector are without a key; I don't know if this is the issue.

Let me know if I am doing anything wrong.

Thanks.

"ROWTIME" only exists as a KSQL concept. It's not actually a field within your value, so therefore the key is being set to null.

Also, ValueToKey isn't listed in the transforms list, so it isn't being applied at all. You'll also have to add "transforms.ValueToKey.type".

You'll have to use a different transform (or a chain of transforms) to set the record timestamp as the ConnectRecord message key.
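
For illustration, a minimal sketch of such a chain, assuming the stock InsertField transform is used to copy the record's timestamp into a value field (record_ts is an assumed name) which ValueToKey then promotes to the key. Note that every transform alias must appear in the transforms list and have a .type property:

        "transforms": "prune,insertTS,createKey",
        "transforms.prune.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
        "transforms.prune.whitelist": "CreatedAt,Id,Text,Source,Truncated",
        "transforms.insertTS.type": "org.apache.kafka.connect.transforms.InsertField$Value",
        "transforms.insertTS.timestamp.field": "record_ts",
        "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
        "transforms.createKey.fields": "record_ts"

Transforms run in the order listed, so record_ts is inserted after the value is pruned and before the key is created.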

That error means your data is not keyed: it is not in a key/value format like {'key':'value'}. If you read your data directly from the broker as a troubleshooting step, you will find that it has only values, without any keys.

Use this command to read your data from the broker:

/bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic your_topic_name --from-beginning
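
By default that prints only the values; to confirm the keys really are null, you can add the consumer formatter property print.key:

/bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic your_topic_name --from-beginning --property print.key=true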

So the best way to solve this issue is to add serialization and key creation to your connector configuration files. Try this file as a source connector (publisher):

name=src-view
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
topic.prefix=test-
connection.url=jdbc:postgresql://127.0.0.1:5434/test?user=testuser&password=testpass
mode=incrementing
incrementing.column.name=id
table.types=table
table.whitelist=table_name
validate.non.null=false
batch.max.rows=10000
bootstrap.servers=localhost:9092

key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schema.registry.url=http://localhost:8081

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

And below is the consumer (sink.conf) to deserialize your data:

name=cas-dest
connector.class=io.confluent.connect.cassandra.CassandraSinkConnector
tasks.max=1
topics=your_topic_name
cassandra.contact.points=127.0.0.1
cassandra.port=9042
cassandra.keyspace=your_keyspace_name
cassandra.write.mode=Update
cassandra.keyspace.create.enabled=true
cassandra.table.manage.enabled=true
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schema.registry.url=http://localhost:8081
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
transforms=createKey
transforms.createKey.fields=id,timestamp
transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey
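
Assuming standalone mode, both files can then be loaded together with the Connect CLI (file names here are placeholders for wherever you saved the configs above):

/bin/connect-standalone worker.properties source.properties sink.conf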

Change createKey.fields to match your data, and be careful: those fields become your Cassandra partition key, so read about data modeling in Cassandra before choosing them. They must also exist in your record value.
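
For illustration only, here is roughly how the managed table could end up keyed with cassandra.table.manage.enabled=true and the createKey fields above (table and column names are assumptions taken from this example config; check the connector's documentation for its exact key-mapping rules):

CREATE TABLE your_keyspace_name.your_topic_name (
    id INT,
    timestamp TIMESTAMP,
    -- remaining columns are derived from the record value
    PRIMARY KEY (id, timestamp)
);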
