Apache Flink join different DataStreams on specific key
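I have two DataStreams. The first, DataStream<String> source, receives records from a message broker. The second, SingleOutputStreamOperator<Event> events, is the result of mapping the source into Event.class.
Some of my use cases need SingleOutputStreamOperator<Event> events and others need DataStream<String> source. In one of the use cases that works on DataStream<String> source, I apply some filters and get a SingleOutputStreamOperator<String> result. I want to avoid mapping source into Event.class a second time, because that operation is already done and the events stream already exists. Instead, I need to look up each record of SingleOutputStreamOperator<String> result in SingleOutputStreamOperator<Event> events, and then apply another map to produce SingleOutputStreamOperator<EventOutDto> out.
This is the idea as an example: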
DataStream<String> source = env.readFrom(source); // pseudocode: records from the message broker
SingleOutputStreamOperator<Event> events = source.map(s -> mapper.readValue(s, Event.class));

public void filterAndJoin(DataStream<String> source, SingleOutputStreamOperator<Event> events) {
    // Pass the filter instance itself; a lambda must return a boolean, not a FilterFunction.
    SingleOutputStreamOperator<String> filtered = source.filter(new FilterFunction());
    // Pseudocode: for each record in the filtered stream, find the event in the events
    // stream whose id matches, and return that event if it is found.
    SingleOutputStreamOperator<EventOutDto> result = (filtered matched against events on id)
        .map(event -> new EventOutDto(event));
    result.addSink(new RichSinkFunction());
}
I have this code:
filtered.join(events)
    // Key of the raw JSON string: its "Id" field, or "" when the field is missing.
    .where(k -> {
        JsonNode tree = mapper.readTree(k);
        String id = "";
        if (tree.get("Id") != null) {
            id = tree.get("Id").asText();
        }
        return id;
    })
    // Key of the already mapped event.
    .equalTo(e -> e.Id)
    .window(TumblingEventTimeWindows.of(Time.seconds(1)))
    // The third type parameter must match the return type of join().
    .apply(new JoinFunction<String, Event, EventOutDto>() {
        @Override
        public EventOutDto join(String s, Event event) throws Exception {
            return new EventOutDto(event);
        }
    })
    .addSink(new SinkFunction());
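Everything in the code above looks correct. The ids are the same, so where(id).equalTo(id) should match, but execution never reaches the apply function.
Observation: the Watermark is always assigned the same timestamp.
Question: why is the apply function never reached?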
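One possible explanation, consistent with the observation about the watermark: an event-time window in Flink is only evaluated once the watermark passes the end of that window, and for a two-input operator such as a join the operator's watermark is the minimum of both inputs. If the watermark stays at the same timestamp, the window never closes and apply is never called. The following is a plain-Java model of that rule, not Flink code; the class and method names are hypothetical and it assumes non-negative timestamps and no window offset.

```java
/** Plain-Java model of event-time tumbling windows; hypothetical names, not Flink code. */
final class TumblingWindowSketch {

    /** Start of the tumbling window (no offset) that contains a non-negative timestamp. */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    /** An event-time window fires once the watermark reaches its last timestamp (end - 1). */
    static boolean fires(long windowStartMs, long sizeMs, long watermarkMs) {
        long maxTimestamp = windowStartMs + sizeMs - 1;
        return watermarkMs >= maxTimestamp;
    }

    public static void main(String[] args) {
        long size = 1_000L;                       // TumblingEventTimeWindows.of(Time.seconds(1))
        long start = windowStart(1_500L, size);   // window [1000, 2000)
        // Watermark stuck at the element's own timestamp: the window stays open.
        System.out.println(fires(start, size, 1_500L)); // prints false
        // Watermark advanced past the window end: the window fires.
        System.out.println(fires(start, size, 2_000L)); // prints true
    }
}
```

If this is the cause, checking that both inputs of the join have timestamps and watermarks assigned, and that the watermarks keep advancing, would be the first thing to verify.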
I solved the join problem by doing this:
SingleOutputStreamOperator<ObjectDTO> triggers = candidates
    .keyBy(new KeySelector())
    .intervalJoin(keyedStream.keyBy(e -> e.Id))
    // Match events whose timestamp is within [-2 ms, +1 ms] of the candidate's timestamp.
    .between(Time.milliseconds(-2), Time.milliseconds(1))
    .process(new ProcessFunctionOne())
    .keyBy(k -> k.otherId)
    .process(new ProcessFunctionTwo());
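The answer does not say why the interval join works where the window join did not. A plausible reading is that an interval join pairs elements by comparing their timestamps as they arrive, rather than waiting for a window to close. The time condition behind between(lower, upper) can be sketched in plain Java as below; this is a model with hypothetical names, not Flink code, and it assumes the default inclusive bounds.

```java
/** Plain-Java model of the interval-join time condition; hypothetical names, not Flink code. */
final class IntervalJoinSketch {

    /**
     * True when rightTs lies in [leftTs + lowerMs, leftTs + upperMs],
     * with both bounds inclusive (the default for between(lower, upper)).
     */
    static boolean inInterval(long leftTs, long rightTs, long lowerMs, long upperMs) {
        return rightTs >= leftTs + lowerMs && rightTs <= leftTs + upperMs;
    }

    public static void main(String[] args) {
        long lower = -2L;  // Time.milliseconds(-2)
        long upper = 1L;   // Time.milliseconds(1)
        System.out.println(inInterval(100L, 98L, lower, upper));   // prints true
        System.out.println(inInterval(100L, 101L, lower, upper));  // prints true
        System.out.println(inInterval(100L, 102L, lower, upper));  // prints false
    }
}
```

Here the left side corresponds to candidates and the right side to the stream keyed by e.Id, so an event is paired with a candidate of the same key only if its timestamp is at most 2 ms earlier or 1 ms later.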