Flink DataStreams Evictor

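I am using Flink DataStreams to join two streams (a Book stream and a Publisher stream). I am trying to use an evictor to remove elements once they have been deleted from the database, which is indicated by the variable deleted.

The code runs fine without the evictor, but fails as soon as I add it.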

DataStream<BooksWithPublishers> book_publisher = bookStream
        .join(publishStream)
        .where(value -> value.publisherId)
        .equalTo(value -> value.id)
        .window(GlobalWindows.create())
        .trigger(new ForeverTrigger<>())
        .evictor(new Evictor<CoGroupedStreams.TaggedUnion<Book, Publisher>, GlobalWindow>() {
            @Override
            public void evictBefore(Iterable<TimestampedValue<CoGroupedStreams.TaggedUnion<Book, Publisher>>> elements, int size, GlobalWindow window, EvictorContext evictorContext) {
                Iterator<TimestampedValue<CoGroupedStreams.TaggedUnion<Book, Publisher>>> it = elements.iterator();

                while(it.hasNext()){
                    CoGroupedStreams.TaggedUnion<Book, Publisher> cg = it.next().getValue();

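                    // A TaggedUnion holds either a Book or a Publisher, never both,
                    // so check which side is set before reading the deleted flag.
                    boolean deleted = cg.isOne() ? cg.getOne().deleted : cg.getTwo().deleted;

                    if(deleted){
                        it.remove();
                    }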

                }
            }

            @Override
            public void evictAfter(Iterable<TimestampedValue<CoGroupedStreams.TaggedUnion<Book, Publisher>>> elements, int size, GlobalWindow window, EvictorContext evictorContext) {
            }
        })
        .apply(new JoinFunction<Book, Publisher, BooksWithPublishers>() {
            @Override
            public BooksWithPublishers join(Book first, Publisher second) throws Exception {

                return new BooksWithPublishers(first.id,first.releaseDate,first.title,second);
            }
        });

Error:

org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot serialize operator object class org.apache.flink.streaming.api.operators.SimpleUdfStreamOperatorFactory.
    at org.apache.flink.streaming.api.graph.StreamConfig.setStreamOperatorFactory(StreamConfig.java:304)
    at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.setVertexConfig(StreamingJobGraphGenerator.java:694)
    at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createChain(StreamingJobGraphGenerator.java:438)
    at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createChain(StreamingJobGraphGenerator.java:399)
    at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createChain(StreamingJobGraphGenerator.java:390)
    at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.setChaining(StreamingJobGraphGenerator.java:356)
    at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:179)
    at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:116)
    at org.apache.flink.streaming.api.graph.StreamGraph.getJobGraph(StreamGraph.java:908)
    at org.apache.flink.client.StreamGraphTranslator.translateToJobGraph(StreamGraphTranslator.java:50)
    at org.apache.flink.client.FlinkPipelineTranslationUtil.getJobGraph(FlinkPipelineTranslationUtil.java:39)
    at org.apache.flink.client.deployment.executors.PipelineExecutorUtils.getJobGraph(PipelineExecutorUtils.java:56)
    at org.apache.flink.client.deployment.executors.LocalExecutor.getJobGraph(LocalExecutor.java:104)
    at org.apache.flink.client.deployment.executors.LocalExecutor.execute(LocalExecutor.java:82)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1905)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1796)
    at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:69)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1782)
    at com.x.x.pipelines.Bootstrapper.boot(Bootstrapper.java:68)
    at com.x.x.Main.main(Main.java:7)
Caused by: java.io.NotSerializableException: com.x.x.pipelines.library.author.index.AuthorIndex
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
        at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:586)
        at org.apache.flink.util.InstantiationUtil.writeObjectToConfig(InstantiationUtil.java:515)
        at org.apache.flink.streaming.api.graph.StreamConfig.setStreamOperatorFactory(StreamConfig.java:301)
        ... 19 more

I also tried running the code with .evictor() and empty methods, but it still gave me the error.

Why can't I use evictor()?

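The problem is most likely that your enclosing class (presumably AuthorIndex, the class named in the NotSerializableException) is not serializable, and your program is trying to serialize it. An anonymous class created inside an instance method keeps an implicit reference to the enclosing instance, so serializing the evictor also tries to serialize that instance. You can avoid this by moving the evictor into a separate class instead of using an anonymous class, or by making the method that builds the pipeline static.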
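The effect can be reproduced without Flink. The following is a minimal sketch using only java.io; the names Filter, Index and DeletedFilter are hypothetical stand-ins (for a serializable user function such as an Evictor, for the non-serializable enclosing class, and for the separate class suggested above) and are not part of the original code or of the Flink API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class EvictorSerializationDemo {

    /** Hypothetical stand-in for a user function that must be serializable. */
    interface Filter extends Serializable {
        boolean shouldEvict(boolean deleted);
    }

    /** Hypothetical stand-in for the enclosing class; note it is NOT Serializable. */
    static class Index {
        Filter anonymousFilter() {
            // Anonymous class created in an instance method:
            // it carries a hidden reference to the enclosing Index instance.
            return new Filter() {
                @Override
                public boolean shouldEvict(boolean deleted) {
                    return deleted;
                }
            };
        }

        Filter standaloneFilter() {
            // Separate class: no reference back to Index.
            return new DeletedFilter();
        }
    }

    /** Static nested class, so it has no enclosing instance to serialize. */
    static class DeletedFilter implements Filter {
        private static final long serialVersionUID = 1L;

        @Override
        public boolean shouldEvict(boolean deleted) {
            return deleted;
        }
    }

    /** Returns true if Java serialization can write the object, false on NotSerializableException. */
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Index index = new Index();
        System.out.println(isSerializable(index.anonymousFilter()));  // anonymous class
        System.out.println(isSerializable(index.standaloneFilter())); // separate class
    }
}
```

In this sketch the anonymous filter fails to serialize because of its hidden reference to Index, while the separate DeletedFilter serializes fine. Applied to the job in the question, the same idea would mean moving the Evictor implementation into its own top-level or static nested class and passing an instance of it to .evictor(...).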
