FLINK left join three DataStreams using cogroup
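I am trying to merge three streams into a single stream. I tried a union but could not proceed because the schemas differ, and if I merge the schemas the result becomes too large.
So I am using **coGroup** to perform a left join and return a tuple of the three streams.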
```java
DataStream<Tuple3<Schema1, Schema2, Schema3>> mergeJoin = stream1.coGroup(stream2)
        .where(genericRecord -> SchemaUtils.getKey1(genericRecord))
        .equalTo(genericRecord -> SchemaUtils.getKey2(genericRecord))
        .window(TumblingProcessingTimeWindows.of(Time.seconds(3)))
        .apply(new LeftOuterJoin1())
        // Second join: the output of LeftOuterJoin1 is co-grouped with stream3.
        .coGroup(stream3)
        .where(tuple_Schema1_Schema2 -> SchemaUtils.getKey1(tuple_Schema1_Schema2))
        .equalTo(genericRecord -> SchemaUtils.getKey3(genericRecord))
        .window(TumblingProcessingTimeWindows.of(Time.seconds(6)))
        .apply(new LeftOuterJoin2());

public static class LeftOuterJoin2 implements CoGroupFunction<Tuple2<Schema1, Schema2>, GenericRecord, Tuple3<Schema1, Schema2, Schema3>> {
    @Override
    public void coGroup(Iterable<Tuple2<Schema1, Schema2>> iterable, Iterable<GenericRecord> iterable1, Collector<Tuple3<Schema1, Schema2, Schema3>> collector) throws Exception {
        final Schema3 NULL_ELEMENT = null;
        ObjectMapper mapper = new ObjectMapper();
        for (Tuple2<Schema1, Schema2> leftElem : iterable) {
            Schema1 schema1_data = leftElem.f0;
            Schema2 schema2_data = leftElem.f1;
            boolean hadElements = false;
            // Emit one tuple per matching record from the third stream.
            for (GenericRecord rightElem : iterable1) {
                Schema3 schema3_data = mapper.readValue(rightElem.toString(), Schema3.class);
                collector.collect(new Tuple3<>(schema1_data, schema2_data, schema3_data));
                hadElements = true;
            }
            // Left outer join: keep the left element even when nothing matched.
            if (!hadElements) {
                collector.collect(new Tuple3<>(schema1_data, schema2_data, NULL_ELEMENT));
            }
        }
    }
}
```
When I merge the tuple stream built from the first two streams with the third one, I get the following error at **apply(new LeftOuterJoin2())**:
> Cannot resolve method 'apply(LeftOuterJoin2)'
Expected result 1: a stream of tuples that combines the three streams.
Expected result 2: the number of output records should equal the number of Schema1 records.
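Create an `Either3<Schema1, Schema2, Schema3>` class that is similar to Flink's `Either` class. In each of your three incoming streams, convert your records to an `Either3` with the appropriate field set. Then you can key the stream and use a `KeyedProcessFunction` to do whatever you want with the data.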
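A minimal sketch of what such an `Either3` wrapper might look like, written in plain Java with no Flink dependency. The factory and accessor names (`first`, `second`, `third`, `isFirst`, `getFirst`, and so on) are assumptions made for illustration. They are not part of Flink and are only loosely modelled on the two-way `Either` idea.

```java
import java.util.NoSuchElementException;

/**
 * Hypothetical three-way variant of an Either type: an instance holds exactly
 * one of a first, second, or third value. All names are illustrative.
 */
final class Either3<A, B, C> {
    private final int tag;      // 1, 2 or 3: which slot is populated
    private final Object value; // the wrapped record

    private Either3(int tag, Object value) {
        this.tag = tag;
        this.value = value;
    }

    // One factory per source stream.
    static <A, B, C> Either3<A, B, C> first(A a)  { return new Either3<>(1, a); }
    static <A, B, C> Either3<A, B, C> second(B b) { return new Either3<>(2, b); }
    static <A, B, C> Either3<A, B, C> third(C c)  { return new Either3<>(3, c); }

    boolean isFirst()  { return tag == 1; }
    boolean isSecond() { return tag == 2; }
    boolean isThird()  { return tag == 3; }

    @SuppressWarnings("unchecked")
    A getFirst() {
        if (tag != 1) throw new NoSuchElementException("not a first value");
        return (A) value;
    }

    @SuppressWarnings("unchecked")
    B getSecond() {
        if (tag != 2) throw new NoSuchElementException("not a second value");
        return (B) value;
    }

    @SuppressWarnings("unchecked")
    C getThird() {
        if (tag != 3) throw new NoSuchElementException("not a third value");
        return (C) value;
    }
}
```

In a job, each stream would presumably be mapped with the matching factory (for example `Either3.first(record)` for the first stream). Because all three streams then share one element type, they can be combined, keyed, and handled in a single `KeyedProcessFunction` that branches on `isFirst()`, `isSecond()`, and `isThird()`. A class shaped like this does not follow Flink's POJO rules, so it would likely need its own type information or fall back to generic serialization.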