Kafka Connect, Cassandra Sink: How to specify the partition and clustering keys?

Original post 2020-04-04 13:14:12 2 1 apache-kafka/ cassandra/ apache-kafka-connect

I went through the Cassandra Sink doc but I don't see how to specify the partition and clustering keys.

The doc says this:

You can configure this connector to manage the schema on the Cassandra cluster. When altering an existing table the key is ignored. This is to avoid the potential issues around changing a primary key on an existing table. The key schema is used to generate a primary key for the table when it is created.

If it is a new table, the connector will use the key schema (from the KStream, I suppose) to create the primary key. That might be OK for the partition key, but not for the clustering key.

So are we forced to create all the tables with the right keys before running the Streaming app, or is there a way to adjust things?

1 answer

Confluent's connector requires that all columns in the primary key be present in the key of the topic's records (as a struct, if I remember correctly). This is one of its limitations, as it may not match the output of your application. In that case you'll need to transform the topic to meet this requirement.
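To make the re-keying idea concrete, here is a minimal sketch in plain Python (no Kafka client involved). It only shows the shape of the transformation: every primary key column is copied from the message value into a struct-like key. The function name `rekey`, the field names, and the table layout are hypothetical. In a real pipeline this step would be done in the streaming application itself or with a Kafka Connect single message transform such as `ValueToKey`.

```python
from typing import Any, Dict, List, Tuple


def rekey(value: Dict[str, Any], pk_fields: List[str]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (key, value) where key holds every primary key column taken from value."""
    missing = [f for f in pk_fields if f not in value]
    if missing:
        raise KeyError(f"value is missing primary key fields: {missing}")
    # Preserve the order of pk_fields: partition key column(s) first, then clustering column(s).
    key = {f: value[f] for f in pk_fields}
    return key, value


# Hypothetical table with PRIMARY KEY ((sensor_id), reading_time)
key, value = rekey(
    {"sensor_id": "s-1", "reading_time": 1586006052, "temp": 21.5},
    ["sensor_id", "reading_time"],
)
print(key)
```

The value is left untouched, so the sink still sees all columns. Only the key changes so that it carries both the partition and clustering columns.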

Instead of Confluent's connector, I recommend using DataStax's Kafka Connector, which is carefully designed to load data into Cassandra/DSE efficiently. It has the following features (more information is in the following blog post):

Stores data from one topic in one or multiple Cassandra tables (to support data denormalization);
The mapping of topic data to Cassandra columns is defined in the configuration file, so you can take any piece of the message key or value and map it to a column;
Efficient and lightweight, using unlogged batches where possible;
Supports the different security features of Cassandra/DSE.
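As a rough illustration of the mapping feature, the sketch below builds a connector configuration in Python and prints it as JSON. The property names (`topic.<topic>.<keyspace>.<table>.mapping`, `contactPoints`, `loadBalancing.localDc`) and the connector class are written from memory of the DataStax documentation and should be checked against the version you install. The topic, keyspace, table, and column names are hypothetical, and the sketch assumes the table already exists with the partition and clustering keys you want.

```python
import json
from typing import Dict


def build_mapping(columns: Dict[str, str]) -> str:
    """Render {column: source} as 'col1=key.a, col2=value.b'."""
    return ", ".join(f"{col}={src}" for col, src in columns.items())


# Hypothetical names, assuming a table created beforehand with
# PRIMARY KEY ((sensor_id), reading_time)
topic, keyspace, table = "sensor_readings", "iot", "readings_by_sensor"

mapping = build_mapping({
    "sensor_id": "key.sensor_id",         # partition key column, taken from the message key
    "reading_time": "value.reading_time", # clustering column, taken from the message value
    "temp": "value.temp",                 # regular column
})

config = {
    "name": "cassandra-sink-sketch",
    "config": {
        # Assumed class name; verify it for your connector version.
        "connector.class": "com.datastax.oss.kafka.sink.CassandraSinkConnector",
        "topics": topic,
        "contactPoints": "127.0.0.1",
        "loadBalancing.localDc": "datacenter1",
        f"topic.{topic}.{keyspace}.{table}.mapping": mapping,
    },
}
print(json.dumps(config, indent=2))
```

Because each column names its own source, the primary key columns do not all have to come from the message key, which is the main difference from the limitation described above.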

The connector is free to use with DSE starting from version 4.8, and with Cassandra starting from version 2.1.

Kafka Connect Sink Partition by recordField which is in Ticks

Partition By Multiple Nested Fields in Kafka Connect HDFS Sink

kafka-connect : Getting error in distributed configuration for connector sink cassandra

how to design Kafka Connect that is Sink as well as Source

How to activate and configure ElasticSearch Kafka Connect sink?

Offset and Partition - Kafka Sink Processor

Kafka sink Error “This connector requires that records from Kafka contain the keys for the Cassandra table”

Kafka S3 Sink Connector - how to mark a partition as complete

Kafka Connect Hbase sink

ClassCastException in kafka Cassandra sink connector


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


Answer 1 (accepted): 2020-04-04 14:11:35
