
Kafka Connect cannot cast custom storage sink partitioner to Partitioner interface

I need to create a custom partitioner for the Kafka Connect S3 sink plugin. I've extended the HourlyPartitioner in a custom class using Kotlin:

class RawDumpHourlyPartitioner<T> : HourlyPartitioner<T>() {
...
}
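
The actual body is elided above; as a purely hypothetical sketch, such a subclass might override the encoded partition path like this (the "raw/" prefix is illustrative, not our real logic):

import io.confluent.connect.storage.partitioner.HourlyPartitioner
import org.apache.kafka.connect.sink.SinkRecord

class RawDumpHourlyPartitioner<T> : HourlyPartitioner<T>() {
    // Illustrative override: prefix every encoded partition path
    override fun encodePartition(sinkRecord: SinkRecord): String =
        "raw/" + super.encodePartition(sinkRecord)
}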

I then changed my connector config accordingly to use the custom class:

"partitioner.class": "co.myapp.RawDumpHourlyPartitioner",

Next, I built our jar (we use Shadow) and included it in a custom Docker image based on the Kafka Connect image (the image version matches the dependencies we use in the project):

FROM gradle:6.0-jdk8 as builder
WORKDIR /app
ADD . .
RUN gradle clean shadowJar

FROM confluentinc/cp-kafka-connect:5.3.2

COPY --from=builder /app/build/libs/kafka-processor-0.1-all.jar /usr/share/java/kafka/kafka-processor.jar
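
The build file itself isn't shown here; a minimal hypothetical build.gradle.kts that would produce the kafka-processor-0.1-all.jar above might look like this (note the compileOnly scope, so the Connect and storage-common classes are not bundled into the shadow jar):

plugins {
    kotlin("jvm") version "1.3.61"
    id("com.github.johnrengelman.shadow") version "5.2.0"
}

group = "co.myapp"
version = "0.1"

repositories {
    mavenCentral()
    maven("https://packages.confluent.io/maven/")
}

dependencies {
    // Provided by the Connect runtime; must not be shaded into the jar
    compileOnly("org.apache.kafka:connect-api:2.3.1")
    compileOnly("io.confluent:kafka-connect-storage-partitioner:5.3.2")
}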

When the connector starts I get this error:

ERROR WorkerSinkTask{id=staging-raw-dump-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
java.lang.ClassCastException: co.myapp.RawDumpHourlyPartitioner cannot be cast to io.confluent.connect.storage.partitioner.Partitioner

To double-check, I created a Java file that instantiates the class, and it didn't throw any error:

import io.confluent.connect.storage.partitioner.Partitioner;

public class InstantiateTest {
    public static void main(String[] args) throws ClassNotFoundException, IllegalAccessException, InstantiationException {
        // Load the partitioner class by name and instantiate it reflectively,
        // much as Kafka Connect does with the partitioner.class config value
        Class<? extends Partitioner<?>> partitionerClass =
                (Class<? extends Partitioner<?>>) Class.forName("co.myapp.RawDumpHourlyPartitioner");

        Partitioner<?> partitioner = partitionerClass.newInstance();
    }
}

Looking at the Kafka Connect guide, it says:

A Kafka Connect plugin is simply a set of JAR files where Kafka Connect can find an implementation of one or more connectors, transforms, and/or converters. Kafka Connect isolates each plugin from one another so that libraries in one plugin are not affected by the libraries in any other plugins. This is very important when mixing and matching connectors from multiple providers.

This means that since I'm using the S3 sink connector, I have to put my jar with the custom partitioner in the directory of the S3 plugin.

Moving the jar file to /usr/share/java/kafka-connect-s3 solved the issue.

In the comments I mentioned that my jar also includes a custom subject name strategy that we use in the main Kafka Connect config (the env variables); in that case the jar needs to be in the /usr/share/java/kafka folder.

Update: as cricket_007 mentioned, it's better to put the custom partitioner jar into the /usr/share/java/kafka-connect-storage-common folder, which is where all the other partitioners are.
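
With the Dockerfile above, that just means changing the COPY destination, e.g.:

COPY --from=builder /app/build/libs/kafka-processor-0.1-all.jar /usr/share/java/kafka-connect-storage-common/kafka-processor.jar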

Depending on which sink you use, the partitioner class needs to go into that sink's plugin directory. In our case we were using Confluent Kafka 5.5 with the Azure Data Lake Gen2 Storage sink connector.

For that, we wrote a custom partitioner similar to an example repo on GitHub.

Then we placed the custom JAR in the following path:

/usr/share/confluent-hub-components/confluentinc-kafka-connect-azure-data-lake-gen2-storage/lib/ 

After that, our connector class worked successfully!
