
Spring Cloud Kafka: Can't serialize data for output stream when two processors are active

I have a working setup for Spring Cloud Kafka Streams using the functional programming style. There are two use cases, which are configured via application.properties. Both of them work individually, but as soon as I activate both at the same time, I get a serialization error for the output stream of the second use case:

Exception in thread "ActivitiesAppId-05296224-5ea1-412a-aee4-1165870b5c75-StreamThread-1" org.apache.kafka.streams.errors.StreamsException:
Error encountered sending record to topic outputActivities for task 0_0 due to:
...
Caused by: org.apache.kafka.common.errors.SerializationException:
Can't serialize data [com.example.connector.model.Activity@497b37ff] for topic [outputActivities]
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException:
Incompatible types: declared root type ([simple type, class com.example.connector.model.Material]) vs com.example.connector.model.Activity

The last line here is important, as the "declared root type" is from the Material class, not the Activity class, which probably points to the source of the error.

Again, when I only activate the second use case before starting the application, everything works fine. So I assume that the "Materials" processor somehow interferes with the "Activities" processor (or its serializer), but I don't know when and where.


Setup

1.) Use case: "Materials"

  • one input stream -> transformation -> one output stream
@Bean
public Function<KStream<String, MaterialRaw>, KStream<String, Material>> processMaterials() {...}

application.properties

spring.cloud.stream.kafka.streams.binder.functions.processMaterials.applicationId=MaterialsAppId
spring.cloud.stream.bindings.processMaterials-in-0.destination=inputMaterialsRaw
spring.cloud.stream.bindings.processMaterials-out-0.destination=outputMaterials

2.) Use case: "Activities"

  • two input streams -> join -> one output stream
@Bean
public BiFunction<KStream<String, ActivityRaw>, KStream<String, Assignee>, KStream<String, Activity>> processActivities() {...}

application.properties

spring.cloud.stream.kafka.streams.binder.functions.processActivities.applicationId=ActivitiesAppId
spring.cloud.stream.bindings.processActivities-in-0.destination=inputActivitiesRaw
spring.cloud.stream.bindings.processActivities-in-1.destination=inputAssignees
spring.cloud.stream.bindings.processActivities-out-0.destination=outputActivities

The two processors are also registered as stream functions in application.properties: spring.cloud.stream.function.definition=processActivities;processMaterials
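Putting the pieces together, the relevant application.properties entries when both processors are active look like this (combined from the snippets above):

```properties
spring.cloud.stream.function.definition=processActivities;processMaterials

spring.cloud.stream.kafka.streams.binder.functions.processMaterials.applicationId=MaterialsAppId
spring.cloud.stream.bindings.processMaterials-in-0.destination=inputMaterialsRaw
spring.cloud.stream.bindings.processMaterials-out-0.destination=outputMaterials

spring.cloud.stream.kafka.streams.binder.functions.processActivities.applicationId=ActivitiesAppId
spring.cloud.stream.bindings.processActivities-in-0.destination=inputActivitiesRaw
spring.cloud.stream.bindings.processActivities-in-1.destination=inputAssignees
spring.cloud.stream.bindings.processActivities-out-0.destination=outputActivities
```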

Thanks!

Update - Here's how I use the processors in the code:

Implementation

// Material model
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class MaterialRaw {
    private String id;
    private String name;
}

@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Material {
    private String id;
    private String name;
}

// Material processor
@Bean
public Function<KStream<String, MaterialRaw>, KStream<String, Material>> processMaterials() {
    return materialsRawStream -> materialsRawStream.map((recordKey, materialRaw) -> {
        // some transformation
        final var newId = materialRaw.getId() + "---foo";
        final var newName = materialRaw.getName() + "---bar";
        final var material = new Material(newId, newName);

        // output
        return new KeyValue<>(recordKey, material);
    });
}
// Activity model
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class ActivityRaw {
    private String id;
    private String name;
}

@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Assignee {
    private String id;
    private String assignedAt;
}

/**
 * Combination of `ActivityRaw` and `Assignee`
 */
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Activity {
    private String id;
    private Integer number;
    private String assignedAt;
}

// Activity processor
@Bean
public BiFunction<KStream<String, ActivityRaw>, KStream<String, Assignee>, KStream<String, Activity>> processActivities() {
    return (activitiesRawStream, assigneesStream) -> { 
        final var joinWindow = JoinWindows.of(Duration.ofDays(30));

        final var streamJoined = StreamJoined.with(
            Serdes.String(),
            new JsonSerde<>(ActivityRaw.class),
            new JsonSerde<>(Assignee.class)
        );

        final var joinedStream = activitiesRawStream.leftJoin(
            assigneesStream,
            new ActivityJoiner(),
            joinWindow,
            streamJoined
        );

        // identity mapping (no further per-record transformation yet)
        final var mappedStream = joinedStream.map((recordKey, activity) -> {
            return new KeyValue<>(recordKey, activity);
        });

        return mappedStream;
    };
}
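The ActivityJoiner referenced above is not shown in the question. A minimal sketch of what it might look like, assuming it simply merges fields from ActivityRaw and a (possibly null) Assignee into an Activity; the `number` field is a placeholder here, since ActivityRaw carries no such field:

```java
import org.apache.kafka.streams.kstream.ValueJoiner;

// Hypothetical sketch: the real ActivityJoiner is not shown in the question.
public class ActivityJoiner implements ValueJoiner<ActivityRaw, Assignee, Activity> {

    @Override
    public Activity apply(ActivityRaw activityRaw, Assignee assignee) {
        // assignee can be null on a left join when no matching record arrived
        final var assignedAt = assignee != null ? assignee.getAssignedAt() : null;
        // `number` is assumed to be computed elsewhere; null as a placeholder
        return new Activity(activityRaw.getId(), null, assignedAt);
    }
}
```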

This turns out to be an issue with the way the binder infers Serde types when there are multiple functions with different outbound target types, one with Activity and another with Material in your case. We will have to address this in the binder. I created an issue here.

In the meantime, you can follow this workaround.

Create a custom Serde class as below.

public class ActivitySerde extends JsonSerde<Activity> {}

Then, explicitly use this Serde for the outbound of your processActivities function using configuration.

For example:

spring.cloud.stream.kafka.streams.bindings.processActivities-out-0.producer.valueSerde=com.example.so65003575.ActivitySerde

Please change the package to the appropriate one if you are trying this workaround.

Here is another recommended approach. If you define a bean of type Serde with the target type, it takes precedence, as the binder will match it against the KStream type. Therefore, you can also do this without defining the extra class from the workaround above.

@Bean
public Serde<Activity> activitySerde() {
  return new JsonSerde<>(Activity.class);
}
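If the Materials output ever runs into the same inference problem, a symmetric bean would presumably work the same way. A sketch, assuming the same JsonSerde approach as above:

```java
// Hypothetical sketch: symmetric Serde bean for the Material output type
@Bean
public Serde<Material> materialSerde() {
    return new JsonSerde<>(Material.class);
}
```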

Here are the docs where all these details are explained.

You need to specify which binder to use for each function: s.c.s.bindings.xxx.binder=....

However, without that, I would have expected an error such as "multiple binders found but no default specified", which is what happens with message channel binders.
