In a Kafka Streams application, is there a way to define a topology using a wildcard list of output topics?

Question

I have a multi-schema Kafka Streams application that enriches a record via a join to a KTable, and then passes the enriched record along.

The input topic naming format is currently well defined, but I'm changing this to a wildcard. I want to determine the input topic of each record, derive the output topic via regex replacement, and send the record on.

E.g., while listening to event.raw.*, a record comes in on event.raw.foo and I wish to pass it out on event.foo.

I realise I can get the input topic via the Processor API:

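import org.apache.avro.generic.GenericRecord; // assuming Avro's GenericRecord
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.To;

public class EnrichmentProcessor extends AbstractProcessor<String, GenericRecord> {

    @Override
    public void process(String key, GenericRecord value) {
        // Do join...

        // Determine output topic and forward.
        // The dots are escaped so they match literally rather than as regex wildcards.
        String outputTopic = context().topic().replaceFirst("\\.raw\\.", ".");
        // Note: To.child() takes the name of a downstream node, not a topic name.
        context().forward(key, value, To.child(outputTopic));
        context().commit();
    }
}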

But this doesn't help me when I'm trying to define my Topology, because I have no way of knowing up front what my output topic is going to be.

InternalTopologyBuilder topologyBuilder = new InternalTopologyBuilder();
topologyBuilder.addSource("SOURCE", stringDeserializer, genericRecordDeserializer, "event.raw.*")
    .addProcessor("ENRICHER", EnrichmentProcessor::new, "SOURCE")
    // How can I register all possible output topics here?
    .addSink("OUTPUT", outputTopic, stringSerializer, genericRecordSerializer, "ENRICHER");

Has anyone solved a situation like this before?

I know that if I had a list of possible output topic names up front, I could define multiple sinks on the topology, but I won't have such a list.

Is there a way I can define the topology to have dynamically allocated output topic names when I don't have a hard-coded list of possible output topic names up front?

Answer 1

This should be possible: you can use Topology#addSink(..., new TopicNameExtractor(){...}, ...) to dynamically set an output topic name. TopicNameExtractor has access to the RecordContext, which allows you to get the input topic name via context.topic(). Hence, you should be able to compute the output topic name based on the input topic name.
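
As a rough illustration of the name computation, here is a minimal sketch of the mapping from input topic to output topic, using only java.util.regex. The class name OutputTopics and the method fromInputTopic are hypothetical helpers made up for this example, and the event.raw. prefix is taken from the question:

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Kafka Streams): derives the output topic
// name from the input topic name, e.g. event.raw.foo -> event.foo.
final class OutputTopics {

    // Matches the literal prefix "event.raw." at the start of the topic name.
    // The dots are escaped so they do not act as regex wildcards.
    private static final Pattern RAW_PREFIX = Pattern.compile("^event\\.raw\\.");

    private OutputTopics() {
        // Static utility class, not meant to be instantiated.
    }

    // Returns the topic with the "raw" segment removed. Topics that do not
    // start with "event.raw." are returned unchanged.
    static String fromInputTopic(String inputTopic) {
        return RAW_PREFIX.matcher(inputTopic).replaceFirst("event.");
    }
}
```

Assuming the serializers and node names from the question, the sink could then presumably be registered with a lambda as the extractor, along the lines of addSink("OUTPUT", (key, value, recordContext) -> OutputTopics.fromInputTopic(recordContext.topic()), stringSerializer, genericRecordSerializer, "ENRICHER"). With this approach the processor no longer needs to pick a topic itself and can forward to its child as usual.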

2019-06-24 22:59:39