
Flink Kafka EXACTLY_ONCE causing KafkaException ByteArraySerializer is not an instance of Serializer

So, I'm trying to enable EXACTLY_ONCE semantics in my Flink Kafka streaming job, along with checkpointing.

However, I am not getting it to work, so I tried downloading the test sample code from GitHub: https://github.com/apache/flink/blob/c025407e8a11dff344b587324ed73bdba2024dff/flink-end-to-end-tests/flink-streaming-kafka-test/src/main/java/org/apache/flink/streaming/kafka/test/KafkaExample.java

Running this works fine. However, when I enable checkpointing, I get errors. If I change EXACTLY_ONCE to AT_LEAST_ONCE semantics and enable checkpointing, it works fine. But when I change it back to EXACTLY_ONCE, I get this error again.

The exception I'm getting:

org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to construct kafka producer
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:593)
    at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
    at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
    at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:650)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.abortTransactions(FlinkKafkaProducer.java:1099)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.initializeState(FlinkKafkaProducer.java:1036)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:284)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka producer
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:430)
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:298)
    at org.apache.flink.streaming.connectors.kafka.internal.FlinkKafkaInternalProducer.<init>(FlinkKafkaInternalProducer.java:76)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.lambda$abortTransactions$2(FlinkKafkaProducer.java:1107)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1556)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
    at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.serialization.ByteArraySerializer is not an instance of org.apache.kafka.common.serialization.Serializer
    at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstance(AbstractConfig.java:304)
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:360)
    ... 12 more

I've made slight changes to the code to make it work in my environment. I'm running it inside the Flink operations playground in Docker (https://ci.apache.org/projects/flink/flink-docs-stable/getting-started/docker-playgrounds/flink-operations-playground.html), using the latest version, 1.10; the Kafka provided inside it is version 2.2.1.

    public static void main(String[] args) throws Exception {

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.enableCheckpointing(1_000);

        String inputTopic = "my-input";
        String outputTopic = "my-output";
        String kafkaHost = "kafka:9092";

        Properties kafkaProps = new Properties();
        kafkaProps.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaHost);
        kafkaProps.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "test");


        DataStream<KafkaEvent> input = env
                .addSource(new FlinkKafkaConsumer<>(inputTopic, new KafkaEventSchema(), kafkaProps)
                        .assignTimestampsAndWatermarks(new CustomWatermarkExtractor()))
                .keyBy("word")
                .map(new RollingAdditionMapper());

        input.addSink(
                new FlinkKafkaProducer<>(
                        outputTopic,
                        new KafkaEventSerializationSchema(outputTopic),
                        kafkaProps,
                        FlinkKafkaProducer.Semantic.EXACTLY_ONCE));

        env.execute("Modern Kafka Example");
    }
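
For what it's worth, the checkpointing mode itself should not be the issue: as far as I can tell, enableCheckpointing(1_000) already defaults to CheckpointingMode.EXACTLY_ONCE for the checkpoints, which is separate from the sink's semantic. A sketch with the mode spelled out explicitly, in case that matters:

    import org.apache.flink.streaming.api.CheckpointingMode;

    // Equivalent to env.enableCheckpointing(1_000); EXACTLY_ONCE is the default checkpointing mode.
    env.enableCheckpointing(1_000, CheckpointingMode.EXACTLY_ONCE);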

The other classes from the example can be found here: https://github.com/apache/flink/tree/c025407e8a11dff344b587324ed73bdba2024dff/flink-end-to-end-tests/flink-streaming-kafka-test-base/src/main/java/org/apache/flink/streaming/kafka/test/base

I did try changing the serialization to use KafkaSerializationSchema rather than the SerializationSchema used by the example code. However, that code, shown below, did not help either; same error.

import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaEventSerializationSchema implements KafkaSerializationSchema<KafkaEvent> {

    private static final long serialVersionUID = 1L;

    private final String topic;

    public KafkaEventSerializationSchema(String topic) {
        this.topic = topic;
    }

    @Override
    public ProducerRecord<byte[], byte[]> serialize(KafkaEvent element, Long timestamp) {
        // Serialize the event as its string form; no key or explicit partition is set.
        return new ProducerRecord<>(topic, element.toString().getBytes());
    }
}

All help is appreciated. I haven't been able to find any working online code for an EXACTLY_ONCE guarantee between Flink and Kafka, only loads of articles talking about it without actual, substantial working code. That's all I'm trying to achieve here.

I ran into the same problem, and explicitly setting a timeout for the producer helped:

    properties.setProperty("transaction.timeout.ms", "900000");
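
A minimal sketch of where that property goes, reusing kafkaProps and the sink from the question. The background, as documented for Flink's Kafka connector: in EXACTLY_ONCE mode, FlinkKafkaProducer defaults transaction.timeout.ms to 1 hour, while Kafka brokers reject transaction timeouts above transaction.max.timeout.ms, which defaults to 15 minutes; 900000 ms (15 minutes) keeps the producer within the broker's limit.

    // Set before constructing the sink: stay at or below the broker's
    // transaction.max.timeout.ms (15 minutes by default).
    kafkaProps.setProperty("transaction.timeout.ms", "900000");

    input.addSink(
            new FlinkKafkaProducer<>(
                    outputTopic,
                    new KafkaEventSerializationSchema(outputTopic),
                    kafkaProps,
                    FlinkKafkaProducer.Semantic.EXACTLY_ONCE));

Alternatively, transaction.max.timeout.ms can be raised on the broker side; either way, the producer's transaction timeout must not exceed the broker's maximum.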
