
In a Kafka Streams application, is there a way to define a topology with a wildcard list of output topics?

I have a multi-schema Kafka Streams application that enriches a record via a join to a KTable, and then passes the enriched record along.

The input topic naming format is currently well defined, but I'm changing this to a wildcard. I want to determine the input topic of each record, derive the output topic via regex replacement, and send the record on.

E.g., while listening to event.raw.*, a record comes in on event.raw.foo and I wish to pass it out on event.foo.
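The name mapping described above can be sketched in plain Java, independent of Kafka. The class and method names here (OutputTopics, fromInput) are hypothetical and not part of the original question; the sketch only assumes that the segment to drop is a literal ".raw.".

```java
import java.util.regex.Pattern;

// Hypothetical helper (not from the original post): derives the output
// topic name from an input topic name by dropping the ".raw." segment.
final class OutputTopics {
    // The dots are escaped so they match literal dots only. An unescaped
    // ".raw." would also match text such as "draws".
    private static final Pattern RAW_SEGMENT = Pattern.compile("\\.raw\\.");

    private OutputTopics() {}

    static String fromInput(String inputTopic) {
        // Replace only the first literal ".raw." with a single dot.
        return RAW_SEGMENT.matcher(inputTopic).replaceFirst(".");
    }
}
```

With this helper, OutputTopics.fromInput("event.raw.foo") yields "event.foo", while a name without a literal ".raw." segment is returned unchanged.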

I realise I can get the input topics via the Processor API:

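import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.To;

public class EnrichmentProcessor extends AbstractProcessor<String, GenericRecord> {

    @Override
    public void process(String key, GenericRecord value) {
        // Do join...

        // Determine the output topic from the input topic and forward.
        // The dots are escaped so that ".raw." is matched literally.
        String outputTopic = context().topic().replaceFirst("\\.raw\\.", ".");
        context().forward(key, value, To.child(outputTopic));
        context().commit();
    }
}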

But this doesn't help me when I'm trying to define my Topology because I have no way of knowing up front what my output topic is going to be.

Topology topology = new Topology();
topology.addSource("SOURCE", stringDeserializer, genericRecordDeserializer, Pattern.compile("event\\.raw\\..*"))
        .addProcessor("ENRICHER", EnrichmentProcessor::new, "SOURCE")
        .addSink("OUTPUT", outputTopic, stringSerializer, genericRecordSerializer, "ENRICHER"); // How can I register all possible output topics here?

Has anyone solved a situation like this before?

I know that if I had a list of possible output topic names up front, I could define multiple sinks on the topology, but I won't have such a list.

Is there a way I can define the topology to have dynamically allocated output topic names when I don't have a hard-coded list of possible output topic names up front?

This should be possible: you can use Topology#addSink(..., new TopicNameExtractor(){...}, ...) to set the output topic name dynamically. TopicNameExtractor has access to the RecordContext, which lets you get the input topic name via context.topic(). Hence, you should be able to compute the output topic name based on the input topic name.
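A minimal sketch of that approach follows, assuming Kafka Streams 2.0 or later (where the TopicNameExtractor overload of addSink is available). The class names RawTopicRouter and DynamicSinkTopology, the helper outputTopicFor, and the node names are illustrative assumptions. The enrichment processor is left out to keep the sketch short; in the topology from the question, the sink's parent would be "ENRICHER" rather than "SOURCE".

```java
import java.util.regex.Pattern;

import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serializer;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.RecordContext;
import org.apache.kafka.streams.processor.TopicNameExtractor;

// Hypothetical extractor: picks the sink topic per record, based on the
// topic the record was originally read from.
final class RawTopicRouter<K, V> implements TopicNameExtractor<K, V> {
    private static final Pattern RAW_SEGMENT = Pattern.compile("\\.raw\\.");

    @Override
    public String extract(K key, V value, RecordContext recordContext) {
        // recordContext.topic() is the input topic of the current record.
        return outputTopicFor(recordContext.topic());
    }

    // Illustrative naming rule: event.raw.foo becomes event.foo.
    static String outputTopicFor(String inputTopic) {
        return RAW_SEGMENT.matcher(inputTopic).replaceFirst(".");
    }
}

// Hypothetical wiring: one pattern-subscribed source and one dynamic sink.
final class DynamicSinkTopology {
    private DynamicSinkTopology() {}

    static <V> Topology build(Deserializer<String> keyDeserializer,
                              Deserializer<V> valueDeserializer,
                              Serializer<String> keySerializer,
                              Serializer<V> valueSerializer) {
        Topology topology = new Topology();
        // Subscribe to every topic whose name starts with "event.raw.".
        topology.addSource("SOURCE", keyDeserializer, valueDeserializer,
                Pattern.compile("event\\.raw\\..*"));
        // No fixed topic name: the extractor decides for each record.
        topology.addSink("OUTPUT", new RawTopicRouter<String, V>(),
                keySerializer, valueSerializer, "SOURCE");
        return topology;
    }
}
```

One design point to keep in mind: Kafka Streams does not create the dynamically chosen output topics itself, so they would need to exist already (or the broker would need to allow auto-creation) before records are routed to them.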
