
Kafka Streams: Define multiple Kafka Streams using Spring Cloud Stream for each set of topics

I am trying to do a simple POC with Kafka Streams, but I am getting an exception while starting the application. I am using Spring-Kafka and Kafka-Streams 2.5.1 with Spring Boot 2.3.5. The Kafka Streams configuration:

@Configuration
public class KafkaStreamsConfig {
    private static final Logger log = LoggerFactory.getLogger(KafkaStreamsConfig.class);

    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processAAA() {
        return input -> input.peek((key, value) -> log
                .info("AAA Cloud Stream Kafka Stream processing : {}", input.toString().length()));
    }

    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processBBB() {
        return input -> input.peek((key, value) -> log
                .info("BBB Cloud Stream Kafka Stream processing : {}", input.toString().length()));
    }

    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processCCC() {
        return input -> input.peek((key, value) -> log
                .info("CCC Cloud Stream Kafka Stream processing : {}", input.toString().length()));
    }

    /*
    @Bean
    public KafkaStreams kafkaStreams(KafkaProperties kafkaProperties) {
        final Properties props = new Properties();
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaProperties.getBootstrapServers());
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "groupId-1");
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, JsonSerde.class);
        props.put(JsonDeserializer.VALUE_DEFAULT_TYPE, JsonNode.class);
        final KafkaStreams kafkaStreams = new KafkaStreams(kafkaStreamTopology(), props);
        kafkaStreams.start();
        return kafkaStreams;
    }

    @Bean
    public Topology kafkaStreamTopology() {
        final StreamsBuilder streamsBuilder = new StreamsBuilder();
        streamsBuilder.stream(Arrays.asList(AAATOPIC, BBBInputTOPIC, CCCInputTOPIC));
        return streamsBuilder.build();
    } */
}

The application.yaml is configured as below. The idea is that there are 3 input and 3 output topics: the component takes input from an input topic and writes its output to the corresponding output topic.

spring:
  application.name: consumerapp-1
  cloud:
    function:
      definition: processAAA;processBBB;processCCC
    stream:
      kafka.binder: 
          brokers: 127.0.0.1:9092
          autoCreateTopics: true
          auto-add-partitions: true
      kafka.streams.binder:
          configuration: 
            commit.interval.ms: 1000
            default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
            default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
      bindings:
        processAAA-in-0:
          destination: aaaInputTopic
        processAAA-out-0:
          destination: aaaOutputTopic
        processBBB-in-0:
          destination: bbbInputTopic
        processBBB-out-0:
          destination: bbbOutputTopic
        processCCC-in-0:
          destination: cccInputTopic
        processCCC-out-0:
          destination: cccOutputTopic

The exception thrown is:

Caused by: java.lang.IllegalArgumentException: Trying to prepareConsumerBinding public abstract void org.apache.kafka.streams.kstream.KStream.to(java.lang.String,org.apache.kafka.streams.kstream.Produced)  but no delegate has been set.
at org.springframework.util.Assert.notNull(Assert.java:201)
at org.springframework.cloud.stream.binder.kafka.streams.KStreamBoundElementFactory$KStreamWrapperHandler.invoke(KStreamBoundElementFactory.java:134)

Can anyone help me with Kafka Streams Spring-Kafka code samples for processing multiple input and output topics?

Update: 21-Jan-2021

After removing the kafkaStreams and kafkaStreamTopology bean configurations, I get the message below in an infinite loop, and message consumption is still not working. I have checked the subscriptions in application.yaml against the @Bean function definitions; they all look correct to me, but I still get this cross-wiring error. I have replaced application.properties with the application.yaml above.

2021-01-21 14:12:43,336 WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator [consumerapp-1-75eec5e5-2772-4999-acf2-e9ef1e69f100-StreamThread-1] [Consumer clientId=consumerapp-1-75eec5e5-2772-4999-acf2-e9ef1e69f100-StreamThread-1-consumer, groupId=consumerapp-1] We received an assignment [cccParserTopic-0] that doesn't match our current subscription Subscribe(bbbParserTopic); it is likely that the subscription has changed since we joined the group. Will try re-join the group with current subscription

I have managed to solve the problem, and I am writing this for the benefit of others. If you want to include multiple streams in a single app jar, the key is to define multiple application IDs, one per stream. I knew this all along but was not aware of how to define it; the answer is something I managed to dig out of the SCSt documentation. Below is how the application.yaml can be defined:

spring:
  application.name: kafkaMultiStreamConsumer
  cloud:
    function:
      definition: processAAA;processBBB;processCCC # needed for imperative @StreamListener
    stream:
      kafka: 
        binder:
          brokers: 127.0.0.1:9092
          min-partition-count: 3
          replication-factor: 2
          transaction:
            transaction-id-prefix: transaction-id-2000
          autoCreateTopics: true
          auto-add-partitions: true
        streams:
          binder:
            functions: # needed for the functional style
              processBBB:
                application-id: SampleBBBapplication
              processAAA:
                application-id: SampleAAAapplication
              processCCC:
                application-id: SampleCCCapplication
            configuration: 
              commit.interval.ms: 1000            
              default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde        
      bindings:
        # Below is for imperative-style programming using the
        # @StreamListener and @SendTo annotations in the .java class
        inputAAA:
          destination: aaaInputTopic
        outputAAA:
          destination: aaaOutputTopic
        inputBBB:
          destination: bbbInputTopic
        outputBBB:
          destination: bbbOutputTopic
        inputCCC:
          destination: cccInputTopic
        outputCCC:
          destination: cccOutputTopic
        # Functional-style programming using Function<KStream...>. Use either one
        # of the two styles, as both are not required. If you use both it is ok,
        # but only one of them works; from what I have seen, @StreamListener is
        # always the one triggered. Below is the functional style.
        processAAA-in-0:
          destination: aaaInputTopic
          group: processAAA-group
        processAAA-out-0:
          destination: aaaOutputTopic
          group: processAAA-group
        processBBB-in-0:
          destination: bbbInputTopic
          group: processBBB-group
        processBBB-out-0:
          destination: bbbOutputTopic
          group: processBBB-group
        processCCC-in-0:
          destination: cccInputTopic
          group: processCCC-group
        processCCC-out-0:
          destination: cccOutputTopic
          group: processCCC-group

Once the above is defined, we need to define the individual Java classes where the stream processing logic is implemented. Your Java class can look like the one below; create similar classes for the other 2 (or N) streams as per your requirement. One example: AAASampleStreamTask.java

@Component
@EnableBinding(AAASampleChannel.class) // One Channel interface corresponding to in-topic and out-topic
public class AAASampleStreamTask {
    private static final Logger log = LoggerFactory.getLogger(AAASampleStreamTask.class);

    @StreamListener(AAASampleChannel.INPUT)
    @SendTo(AAASampleChannel.OUTPUT)
    public KStream<String, String> processAAA(KStream<String, String> input) {
        input.foreach((key, value) -> log.info("Annotation AAA *Sample* Cloud Stream Kafka Stream processing {}", String.valueOf(System.currentTimeMillis())));
        // ... do other business logic ...
        return input;
    }
    
    /**
     * Use the style above or the one below. The style below is the latest,
     * starting from SCSt 3.0 if I am not wrong. These are 2 different styles of
     * consuming Kafka Streams using SCSt. If we have both, the one above gets
     * priority, as per my observation.
     */
    /*
    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processAAA() {
        return input -> input.peek((key, value) -> log.info(
                "Functional AAA *Sample* Cloud Stream Kafka Stream processing : {}",
                System.currentTimeMillis()));
        // ... do other business logic on the stream before returning it ...
    }
    */
}

The channel interface is required only if you want to go with imperative-style programming, not for the functional style. AAASampleChannel.java

public interface AAASampleChannel {
    String INPUT = "inputAAA";
    String OUTPUT = "outputAAA";

    @Input(INPUT)
    KStream<String, String> inputAAA();

    @Output(OUTPUT)
    KStream<String, String> outputAAA();
}
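Not part of the original post, but a way to sanity-check each processor's topology without a running broker is Kafka's own `TopologyTestDriver` from the `kafka-streams-test-utils` artifact. The sketch below hand-builds a topology equivalent to what the binder wires up for `processAAA` (the class name is made up; the topic names match the bindings above):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.KStream;

public class AAAProcessorSmokeTest {
    public static void main(String[] args) {
        // Build the same topology shape the binder creates for processAAA:
        // read from the input topic, apply the function body, write to the output topic.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> in = builder.stream("aaaInputTopic");
        in.peek((key, value) -> System.out.println("AAA processing: " + value))
          .to("aaaOutputTopic");
        Topology topology = builder.build();

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "aaa-smoke-test");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // TopologyTestDriver runs the topology in-process, no broker needed.
        try (TopologyTestDriver driver = new TopologyTestDriver(topology, props)) {
            TestInputTopic<String, String> input = driver.createInputTopic(
                    "aaaInputTopic", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, String> output = driver.createOutputTopic(
                    "aaaOutputTopic", new StringDeserializer(), new StringDeserializer());

            input.pipeInput("k1", "hello");
            System.out.println(output.readValue()); // prints "hello" (passed through)
        }
    }
}
```

Because the `peek` step is a pass-through, the record arrives unchanged on `aaaOutputTopic`; this is a quick way to confirm the processing logic before debugging the binder wiring itself.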

Looks like you are mixing Spring Cloud Stream and Spring Kafka in the application. When using the binder, you don't need to directly define components required by Spring Kafka such as KafkaStreams and Topology; rather, they are created by SCSt implicitly. Can you remove the following beans and try again?

@Bean
public KafkaStreams kafkaStreams(KafkaProperties kafkaProperties) {

and

@Bean
public Topology kafkaStreamTopology() {

If you are still facing issues, please share a small reproducible sample so that we can triage it further.
