Apache Kafka - KStream with KStream Join latest messages

I have created two KStreams and I want to join them. The output of the two streams looks like this:

Stream 1:

2    {"CODE":"AAAA96","STATUS":"SUBMITTED","ID":2}

Stream 2:

26   {"DESCRIPTION":"blah blah blah","QUANTITY":1,"ID_CUSTOMER_ORDER":"GR0100926","ID":26}

I want to create a joined stream (inner join) from these two streams, so I created the following KStream:

KStream<String, String> s_joined = s_order
        .join(s_order_item, (left,right) -> left + right,
                JoinWindows.of(Duration.ofSeconds(30)))
        .mapValues(value -> {
            String[] arrOfstr = value.split("(?<=})");
            JSONObject jl = new JSONObject(arrOfstr[0]);
            JSONObject jr = new JSONObject(arrOfstr[1]);
            JSONObject json = new JSONObject();
            Iterator<String> keys = jl.keys();
            while(keys.hasNext()) {
                String key = keys.next();
                json.put(key, jl.get(key));
            }
            keys = jr.keys();
            while(keys.hasNext()) {
                String key = keys.next();
                json.put(key, jr.get(key));
            }
            return json.toString();
        });

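In this KStream I only perform the join and change the format of the output message, nothing more.

Let me explain what I want to do with an example:

The following messages are published within the window: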

Stream 1

9 {"CODE":"AAAA98","STATUS":"CANCELED","ID":"9"}

Stream 2

9 {"DESCRIPTION":"blah blah blah","QUANTITY":3,"ID_CUSTOMER_ORDER":"GR0100121","ID":"9"}
9 {"DESCRIPTION":"blah blah blah","QUANTITY":0,"ID_CUSTOMER_ORDER":"GR0100480","ID":"9"}
9 {"DESCRIPTION":"blah blah blah","QUANTITY":1,"ID_CUSTOMER_ORDER":"GR0100606","ID":"9"}
9 {"DESCRIPTION":"blah blah blah","QUANTITY":7,"ID_CUSTOMER_ORDER":"GR0100339","ID":"9"}
9 {"DESCRIPTION":"blah blah blah","QUANTITY":6,"ID_CUSTOMER_ORDER":"GR0100911","ID":"9"}

Joined stream

What is published

9 {"CODE":"AAAA98","STATUS":"CANCELED","DESCRIPTION":"blah blah blah","QUANTITY":3,"ID_CUSTOMER_ORDER":"GR0100121","ID":"9"}
9 {"CODE":"AAAA98","STATUS":"CANCELED","DESCRIPTION":"blah blah blah","QUANTITY":0,"ID_CUSTOMER_ORDER":"GR0100480","ID":"9"}
9 {"CODE":"AAAA98","STATUS":"CANCELED","DESCRIPTION":"blah blah blah","QUANTITY":1,"ID_CUSTOMER_ORDER":"GR0100606","ID":"9"}
9 {"CODE":"AAAA98","STATUS":"CANCELED","DESCRIPTION":"blah blah blah","QUANTITY":7,"ID_CUSTOMER_ORDER":"GR0100339","ID":"9"}
9 {"CODE":"AAAA98","STATUS":"CANCELED","DESCRIPTION":"blah blah blah","QUANTITY":6,"ID_CUSTOMER_ORDER":"GR0100911","ID":"9"}

What I want to publish

9 {"CODE":"AAAA98","STATUS":"CANCELED","DESCRIPTION":"blah blah blah","QUANTITY":6,"ID_CUSTOMER_ORDER":"GR0100911","ID":"9"}

In short, I want to publish only the latest message within the window, not all of them. Is this possible?

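You can use the groupByKey function, which returns a KGroupedStream, and then use its reduce or aggregate functions to transform the data the way you want. See the Kafka Streams DSL documentation for more information.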

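I found the answer. The way to achieve what I want is to use the suppress function. In more detail: call groupByKey(), then apply a window with windowedBy(). Finally, aggregate the grouped data and use suppress.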

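// s_joined is already a KStream, so it can be grouped directly
s_joined
        .groupByKey()
        .windowedBy(...)
        .aggregate(...)
        // hold back intermediate results; emit only the final one per window
        .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()));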
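
To make the intended result concrete, here is a minimal plain-Java model of that pipeline with no Kafka dependency. It is only a sketch of the semantics (group by key, bucket records into fixed-size windows, keep the newest value, emit one record per key and window), not Kafka Streams code. The class LatestPerWindow, the Event type, the shortened values, and the timestamps are all made up for illustration, and the model ignores grace periods and emits everything at the end rather than when each window closes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model class; none of these names come from Kafka Streams.
final class LatestPerWindow {

    /** One joined record: key, value, and event time in milliseconds. */
    static final class Event {
        final String key;
        final String value;
        final long timestampMs;

        Event(String key, String value, long timestampMs) {
            this.key = key;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Buckets events into tumbling windows of windowSizeMs, keeps only the
     * most recent event per (window, key), and returns one "key value"
     * line per bucket.
     */
    static List<String> latestPerWindow(List<Event> events, long windowSizeMs) {
        // LinkedHashMap keeps buckets in first-seen order, so output is deterministic.
        Map<String, Event> latest = new LinkedHashMap<>();
        for (Event e : events) {
            long windowStart = (e.timestampMs / windowSizeMs) * windowSizeMs;
            String bucket = windowStart + "|" + e.key;
            // Reducer: the newer event replaces the stored one; late events are ignored.
            latest.merge(bucket, e, (older, newer) ->
                    newer.timestampMs >= older.timestampMs ? newer : older);
        }
        List<String> out = new ArrayList<>();
        for (Event e : latest.values()) {
            out.add(e.key + " " + e.value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Five joined records for key 9, all inside one 30-second window.
        List<Event> joined = Arrays.asList(
                new Event("9", "GR0100121", 1_000L),
                new Event("9", "GR0100480", 2_000L),
                new Event("9", "GR0100606", 3_000L),
                new Event("9", "GR0100339", 4_000L),
                new Event("9", "GR0100911", 5_000L));
        System.out.println(latestPerWindow(joined, 30_000L)); // prints [9 GR0100911]
    }
}
```

In the real topology, the "newer replaces older" reducer is what would go into aggregate(...) (or reduce(...)), while suppress is what prevents the intermediate updates from being forwarded downstream.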
