
Flink: left join three DataStreams using coGroup

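I am trying to merge three streams into a single stream. I tried a union but could not proceed because the schemas differ, and a merged schema would become too large.

So I am using **coGroup** to perform a left join and return a tuple of the three streams.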

```java
DataStream<Tuple3<Schema1, Schema2, Schema3>> mergeJoin = stream1.coGroup(stream2)
        .where(genericRecord -> SchemaUtils.getKey1(genericRecord))
        .equalTo(genericRecord -> SchemaUtils.getKey2(genericRecord))
        .window(TumblingProcessingTimeWindows.of(Time.seconds(3)))
        .apply(new LeftOuterJoin1())
        .coGroup(stream3)
        .where(tuple_Schema1_Schema2 -> SchemaUtils.getKey1(tuple_Schema1_Schema2))
        .equalTo(genericRecord -> SchemaUtils.getKey3(genericRecord))
        .window(TumblingProcessingTimeWindows.of(Time.seconds(6)))
        .apply(new LeftOuterJoin2());
```

```java
public static class LeftOuterJoin2 implements CoGroupFunction<Tuple2<Schema1, Schema2>, GenericRecord, Tuple3<Schema1, Schema2, Schema3>> {

    @Override
    public void coGroup(Iterable<Tuple2<Schema1, Schema2>> leftElems,
                        Iterable<GenericRecord> rightElems,
                        Collector<Tuple3<Schema1, Schema2, Schema3>> collector) throws Exception {
        final Schema3 NULL_ELEMENT = null;
        ObjectMapper mapper = new ObjectMapper();
        for (Tuple2<Schema1, Schema2> leftElem : leftElems) {
            boolean hadElements = false;
            // Emit one Tuple3 per matching right-side record.
            for (GenericRecord rightElem : rightElems) {
                Schema3 schema3Data = mapper.readValue(rightElem.toString(), Schema3.class);
                collector.collect(new Tuple3<>(leftElem.f0, leftElem.f1, schema3Data));
                hadElements = true;
            }
            // No match on the right: keep the left element and pad with null.
            if (!hadElements) {
                collector.collect(new Tuple3<>(leftElem.f0, leftElem.f1, NULL_ELEMENT));
            }
        }
    }
}
```


When joining the tuple stream built from the first two streams with the third, I get the following error at **apply(new LeftOuterJoin2())**:

> Cannot resolve method 'apply(LeftOuterJoin2)'

Expected result 1: a stream of tuples combining the three streams.
Expected result 2: the number of output records should equal the Schema1 record count.
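The second expectation can be checked against the join logic itself. The sketch below reproduces, in plain Java without Flink, what the coGroup function does for a single key within a single window. `LeftOuterJoinSketch` and `leftOuterJoin` are hypothetical names used only for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LeftOuterJoinSketch {

    /**
     * Pairs every left element with every right element of the same group.
     * If the right group is empty, each left element is emitted once with null.
     */
    static <L, R> List<Map.Entry<L, R>> leftOuterJoin(List<L> left, List<R> right) {
        List<Map.Entry<L, R>> out = new ArrayList<>();
        for (L l : left) {
            if (right.isEmpty()) {
                // No match: keep the left element, pad the right side with null.
                out.add(new AbstractMap.SimpleEntry<L, R>(l, null));
            } else {
                // One output record per matching right element.
                for (R r : right) {
                    out.add(new AbstractMap.SimpleEntry<L, R>(l, r));
                }
            }
        }
        return out;
    }
}
```

With this logic the output count equals the left-side count only when each group has at most one right-side element. A group with two right-side matches yields two records per left element, so the total can exceed the Schema1 count.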

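Create a class similar to Flink's Either, e.g. Either3<Schema1, Schema2, Schema3>. In each of your three incoming streams, convert your record into an Either3 with the appropriate field set. You can then key the stream and use a KeyedProcessFunction to do whatever you want with the data.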
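A minimal sketch of such an `Either3` wrapper, written in plain Java so it can be read without a Flink dependency. The method names (`first`, `isFirst`, `getFirst`, and so on) are hypothetical; the answer only names the type. It is loosely modeled on the idea behind Flink's `Either`, not copied from it.

```java
/**
 * Hypothetical three-way tagged union.
 * Exactly one of the three slots holds a value; the tag records which one.
 */
final class Either3<A, B, C> {
    private final int tag;      // 1, 2 or 3: which slot is populated
    private final Object value; // the wrapped record

    private Either3(int tag, Object value) {
        this.tag = tag;
        this.value = value;
    }

    // Factory methods, one per source stream.
    static <A, B, C> Either3<A, B, C> first(A value)  { return new Either3<>(1, value); }
    static <A, B, C> Either3<A, B, C> second(B value) { return new Either3<>(2, value); }
    static <A, B, C> Either3<A, B, C> third(C value)  { return new Either3<>(3, value); }

    boolean isFirst()  { return tag == 1; }
    boolean isSecond() { return tag == 2; }
    boolean isThird()  { return tag == 3; }

    @SuppressWarnings("unchecked")
    A getFirst()  { check(1); return (A) value; }

    @SuppressWarnings("unchecked")
    B getSecond() { check(2); return (B) value; }

    @SuppressWarnings("unchecked")
    C getThird()  { check(3); return (C) value; }

    // Fails fast when the caller reads a slot that is not populated.
    private void check(int expected) {
        if (tag != expected) {
            throw new IllegalStateException(
                    "Slot " + expected + " is empty; populated slot is " + tag);
        }
    }
}
```

In a Flink job, each source stream would be mapped into `Either3<Schema1, Schema2, Schema3>`. Because the three streams then share one element type, they can be unioned, keyed, and handed to a `KeyedProcessFunction` that checks `isFirst()`, `isSecond()`, or `isThird()` on each element. That wiring, and how Flink should serialize the wrapper, is not shown here.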
