
Kafka Streams: Invalid topology: StateStore is not added yet

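Our goal is to implement the architecture below. The most important requirement is that every instance can read all the data of topic T1 (from all partitions).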

[Image: architecture diagram]

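The problem we are facing is that we cannot join two nodes created by different builders (each instance has two different builders). In each instance we create two builders, B1 and B2. B1 creates a source processor that reads from all partitions of topic T1, which is why each instance gets a unique application ID. B2 reads from a single partition of T2. Later, when we perform the join, we get the error "Invalid topology: StateStore aggregated-stream-store is not added yet". We think this is because B1 and B2 have different APP_IDs.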

Here is our code:

StrmApp class

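import java.util.Properties;
import java.util.UUID;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KGroupedStream;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.ValueJoiner;
import org.apache.kafka.streams.state.KeyValueStore;

public class StrmApp extends StrmProc {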
    private StreamsBuilder myBuilder;
    private Validator<String, Data> dataValidator;
    private Properties ownBuilderProps;
    private KafkaStreams ownStreams;

    public StrmApp(ValidDataService dataService, ProcessConfig config, ProcessListener listener) {
        super(dataService, config, listener);
        myBuilder = new StreamsBuilder();
        dataValidator = getValidDataService().getValidator(String.class, Data.class);
        ownBuilderProps = new Properties();
        ownBuilderProps.putAll(getProperties());
        // Unique ID for each instance (different consumer group)
        // application.id is a String config, so store the UUID as a String
        ownBuilderProps.put(StreamsConfig.APPLICATION_ID_CONFIG, UUID.randomUUID().toString());
    }

    private KTable<String, TheDataList> globalStream() {

        // KStream of records from T1 topic using String and TheDataSerde deserializers
        KStream<String, Data> trashStream = getOwnBuilder().stream("T1", Consumed.with(Serdes.String(), SerDes.TheDataSerde));

        // Apply an aggregation operation on the original KStream records using an intermediate representation of a KStream (KGroupedStream)
        KGroupedStream<String, Data> kGroupedStream = trashStream.groupByKey();

        // Describe how a StateStore should be materialized (as a KTable).
        // In our case we use the default RocksDB back end and name the state store "aggregated-stream-store"
        Materialized<String, TheDataList, KeyValueStore<Bytes, byte[]>> materialized = Materialized.as("aggregated-stream-store");
        materialized = materialized.withValueSerde(SerDes.TheDataListSerde);

        // Return a KTable
        return kGroupedStream.aggregate(() -> new TheDataList(), (key, value, aggregate) -> {
            if (!value.getValideData())
                aggregate.getList().removeIf((t) -> t.getTimestamp() <= value.getTimestamp());
            else
                aggregate.getList().add(value);
            return aggregate;
        }, materialized);
    }

    private Data tombstone(String Vid) {
        Data d = new Data();
        d.setVid(Vid);
        d.setValideData(false);
        d.setTimestamp(System.currentTimeMillis());
        return d;
    }

    @Override
    public void run() {
        /* read from topic 2 (T2) - we want to only read one partition */
        KStream<String, Data> inStream = getBuilder()
                .stream(getProcessConfig().getTopicConfig().getTopicIn(), Consumed.with(Serdes.String(), SerDes.TheDataSerde))
                .filter(getValidDataService().getValidator(String.class, Data.class));

        /* Read all partitions from topic 1 (T1) - we want to read from all partitions (P1, P2 and P3) */
        KTable<String, TheDataList> ft = globalStream();

        // ERROR: Invalid topology: StateStore aggregated-stream-store is not added yet.
        // When it comes to do the join it raises this error
        // I think because two builders have different APP_ID
        logger.warn("##JOIN:");
        /* join between data coming from T1 and data coming from T2 */
        KStream<String, TheDataList> validated = inStream.join(ft,
                new ValueJoiner<Data, TheDataList, TheDataList>() {
                    @Override
                    public TheDataList apply(Data valid, TheDataList ivalids) {
                        ivalids.getList().forEach((c) -> {
                            dataValidator.validate(c, valid);
                        });
                        return ivalids;
                    }
                });

        // ...... some code

        ownStreams = StreamTools.startKStreams(getOwnBuilder(), getOwnBuilderProps(), this, this);
        super.startStreams();
    }

    private Properties getOwnBuilderProps() {
        return ownBuilderProps;
    }

    private StreamsBuilder getOwnBuilder() {
        // return getBuilder();
        return myBuilder;
    }

    // .......
}

StrmProc class

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;

public abstract class StrmProc extends AProcess {
    private final StreamsBuilder builder;

    public StrmProc(ValidDataService dataService, ProcessConfig config, ProcessListener listener) {
        super(dataService, config, listener);
        this.builder = new StreamsBuilder();
    }

    protected final StreamsBuilder getBuilder() {
        return builder;
    }

    protected final KafkaStreams startStreams() {
        return StreamTools.startKStreams(getBuilder(), getProperties(), this, this);

    }

    // ........

}

AProcess class

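import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsConfig;

public abstract class AProcess implements Process {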
    private final Properties propertie;
    private final ProcessConfig config;
    private final ValidDataService dataService;
    private final ProcessListener listener;

    protected AProcess(ValidDataService dataService, ProcessConfig config, ProcessListener listener) {
        super();
        this.dataService = dataService;
        this.propertie = getProperties(config);
        this.config = config;
        this.listener = listener;
    }

    private Properties getProperties(ProcessConfig config) {
        Properties kafkaProperties = new Properties();
        kafkaProperties.put(StreamsConfig.APPLICATION_ID_CONFIG, config.getApp());
        kafkaProperties.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, config.getBootstrapServerUrl());
        kafkaProperties.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        return kafkaProperties;
    }

    protected Properties getProperties() {
        return propertie;
    }

    protected ProcessConfig getProcessConfig() {
        return config;
    }

    protected ValidDataService getValidDataService() {
        return dataService;
    }

    // .......

}

How can we achieve this with Kafka Streams?

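To use a join in Kafka Streams, both sides of the join must come from a single StreamsBuilder instance, not two. In your case there are two instances: the variable inStream is created from getBuilder(), while ft is created from getOwnBuilder(), so the state store behind ft is registered in a different topology from the one that performs the join.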
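A minimal sketch of the single-builder approach, not the asker's actual code. It assumes plain String values in place of Data and TheDataList, a hypothetical output topic named "joined-output", and a trivial string-concatenating aggregate and joiner chosen only for illustration. It also assumes T1 and T2 are co-partitioned, which a KStream-KTable join requires at runtime.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Joined;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.state.KeyValueStore;

public class SingleBuilderJoin {

    // Builds the whole topology from ONE StreamsBuilder, so the state store
    // behind the KTable lives in the same topology as the join that reads it.
    public static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Aggregate T1 into a KTable backed by "aggregated-stream-store".
        // String values stand in for the question's Data/TheDataList types.
        KTable<String, String> aggregated = builder
                .stream("T1", Consumed.with(Serdes.String(), Serdes.String()))
                .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
                .aggregate(
                        () -> "",
                        (key, value, aggregate) -> aggregate + value,
                        Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("aggregated-stream-store")
                                .withKeySerde(Serdes.String())
                                .withValueSerde(Serdes.String()));

        // Read T2 from the SAME builder and join it against the table.
        KStream<String, String> input =
                builder.stream("T2", Consumed.with(Serdes.String(), Serdes.String()));

        input.join(aggregated,
                        (left, right) -> left + "|" + right,
                        Joined.with(Serdes.String(), Serdes.String(), Serdes.String()))
                // "joined-output" is a hypothetical topic name.
                .to("joined-output", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }
}
```

Because one builder produces one topology, a single KafkaStreams instance with a single application.id would run it. Printing buildTopology().describe() is a quick way to check that the join processor is connected to the store.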

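In general, Kafka Streams throws TopologyException: Invalid topology: StateStore is not added yet when a processor refers to a state store (for example a KeyValueStore) that has not been added to the same StreamsBuilder instance, for example with streamsBuilder.addStateStore(storeBuilder).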
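The sketch below illustrates that general case under some assumptions: it targets Kafka Streams 3.3 or later (it uses the org.apache.kafka.streams.processor.api types and KStream.process with store names), and the store name "latest-value-store", the output topic "out", and the LatestValueProcessor class are all hypothetical. When registerStore is false, the processor names a store the builder never received, which is the situation the exception describes.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.ProcessorSupplier;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

public class AddStateStoreExample {

    // Hypothetical store name used only in this sketch.
    static final String STORE_NAME = "latest-value-store";

    // Hypothetical processor: remembers the latest value per key, then forwards the record.
    static class LatestValueProcessor implements Processor<String, String, String, String> {
        private ProcessorContext<String, String> context;
        private KeyValueStore<String, String> store;

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
            this.store = context.getStateStore(STORE_NAME);
        }

        @Override
        public void process(Record<String, String> record) {
            store.put(record.key(), record.value());
            context.forward(record);
        }
    }

    public static Topology buildTopology(boolean registerStore) {
        StreamsBuilder builder = new StreamsBuilder();

        if (registerStore) {
            // Register the store with the SAME builder that defines the processor.
            StoreBuilder<KeyValueStore<String, String>> storeBuilder =
                    Stores.keyValueStoreBuilder(
                            Stores.persistentKeyValueStore(STORE_NAME),
                            Serdes.String(),
                            Serdes.String());
            builder.addStateStore(storeBuilder);
        }

        ProcessorSupplier<String, String, String, String> supplier = LatestValueProcessor::new;

        builder.stream("T2", Consumed.with(Serdes.String(), Serdes.String()))
                // The processor declares by name which store it needs.
                .process(supplier, STORE_NAME)
                // "out" is a hypothetical topic name.
                .to("out", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }
}
```

Calling buildTopology(true) should produce a topology whose description lists the store next to the processor, while buildTopology(false) is expected to fail with the TopologyException discussed above.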

