Apache Flink 1.5.2: Rowtime timestamp is null

I am running some queries with the following code:

    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    DataStream<Row> ds = SourceHelp.builder().env(env).consumer010(MyKafka.builder().build().kafkaWithWaterMark2())
            .rowTypeInfo(MyRowType.builder().build().typeInfo())
            .build().source4();
    //,proctime.proctime,rowtime.rowtime
    String sql1 = "select a,b,max(rowtime)as rowtime from user_device group by a,b";
    DataStream<Row> ds2 = TableHelp.builder().tableEnv(tableEnv).tableName("user_device").fields("a,b,rowtime.rowtime")
            .rowTypeInfo(MyRowType.builder().build().typeInfo13())
            .sql(sql1).in(ds).build().result();

    ds2.print();
    // String sql2 = "select a,count(b) as b from user_device2 group by a";
    String sql2 = "select a,count(b) as b,HOP_END(rowtime,INTERVAL '5' SECOND,INTERVAL '30' SECOND) as c from user_device2 group by HOP(rowtime, INTERVAL '5' SECOND, INTERVAL '30' SECOND),a";
    DataStream<Row> ds3 = TableHelp.builder().tableEnv(tableEnv).tableName("user_device2").fields("a,b,rowtime.rowtime")
            .rowTypeInfo(MyRowType.builder().build().typeInfo14())
            .sql(sql2).in(ds2).build().result();

    ds3.print();
    env.execute("test");

Note: in sql1 I apply the max function to rowtime. This does not work, and the following exception is thrown:

    Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: java.lang.RuntimeException: Rowtime timestamp is null. Please make sure that a proper TimestampAssigner is defined and the stream environment uses the EventTime time characteristic.
        at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:625)
        at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:123)
        at com.aicaigroup.water.WaterTest.testRowtimeWithMoreSqls5(WaterTest.java:158)
        at com.aicaigroup.water.WaterTest.main(WaterTest.java:20)
    Caused by: java.lang.RuntimeException: Rowtime timestamp is null. Please make sure that a proper TimestampAssigner is defined and the stream environment uses the EventTime time characteristic.
        at DataStreamSourceConversion$24.processElement(Unknown Source)
        at org.apache.flink.table.runtime.CRowOutputProcessRunner.processElement(CRowOutputProcessRunner.scala:67)
        at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:558)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:533)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:513)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.collect(OperatorChain.java:628)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.collect(OperatorChain.java:581)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:679)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:657)
        at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
        at com.aicaigroup.TableHelp$1.processElement(TableHelp.java:42)
        at com.aicaigroup.TableHelp$1.processElement(TableHelp.java:39)
        at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:558)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:533)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:513)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:679)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:657)
        at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:558)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:533)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:513)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:679)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:657)
        at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
        at org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:151)
        at org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:39)
        at org.apache.flink.streaming.api.operators.LegacyKeyedProcessOperator.processElement(LegacyKeyedProcessOperator.java:88)
        at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
        at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:104)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
        at java.lang.Thread.run(Thread.java:748)
    2018-09-17 09:51:53.679 [Kafka 0.10 Fetcher for Source: Custom Source -> Map -> from: (a, b, rowtime) -> select: (a, b, CAST(rowtime) AS rowtime) (2/8)] INFO oakafka.clients.consumer.internals.AbstractCoordinator - Discovered coordinator 172.16.11.91:9092 (id: 2147483647 rack: null) for group test.

Then I changed sql1 to "select a,b,rowtime from user_device", and it works. So how can I fix the error? The first SQL query has to use GROUP BY, and the second has to use rowtime in a time window. Thanks.

I started with Flink 1.6 and ran into a similar problem. I solved it with these steps:

  • Call assignTimestampsAndWatermarks with the standard BoundedOutOfOrdernessTimestampExtractor implementation. You need to write the extractTimestamp function to extract the timestamp value, and pass the maximum out-of-orderness interval to the constructor.
  • Append ,proctime.proctime,rowtime.rowtime to the end of the field list (I am using fromDataStream in Flink 1.6 to convert the stream to a table).
  • If you want to use an existing field as the rowtime, mark that field in place. For example, if the data source fields are "a,clicktime,c", you can declare "a,clicktime.rowtime,c".
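
To make the first step more concrete, here is a minimal plain-Java sketch of the bookkeeping that a bounded-out-of-orderness extractor performs, as far as I understand it: it remembers the largest timestamp seen so far and reports a watermark that trails it by a fixed bound. The class BoundedWatermarkSketch and its methods are hypothetical names made up for illustration and are not Flink API. In a real job you would subclass Flink's BoundedOutOfOrdernessTimestampExtractor, implement extractTimestamp, and pass the instance to assignTimestampsAndWatermarks.

```java
import java.util.function.ToLongFunction;

/**
 * Plain-Java illustration (no Flink dependency) of bounded-out-of-orderness
 * watermarking. All names here are hypothetical, not Flink classes.
 */
final class BoundedWatermarkSketch<T> {
    private final long maxOutOfOrdernessMs;
    private final ToLongFunction<T> timestampOf;
    private long currentMaxTimestamp;
    private long lastEmittedWatermark = Long.MIN_VALUE;

    BoundedWatermarkSketch(long maxOutOfOrdernessMs, ToLongFunction<T> timestampOf) {
        if (maxOutOfOrdernessMs < 0) {
            throw new IllegalArgumentException("maxOutOfOrdernessMs must be >= 0");
        }
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.timestampOf = timestampOf;
        // Start low enough that the first subtraction below cannot underflow.
        this.currentMaxTimestamp = Long.MIN_VALUE + maxOutOfOrdernessMs;
    }

    /** Returns the element's event-time timestamp and tracks the maximum seen so far. */
    long extractTimestamp(T element) {
        long timestamp = timestampOf.applyAsLong(element);
        currentMaxTimestamp = Math.max(currentMaxTimestamp, timestamp);
        return timestamp;
    }

    /** Watermark = largest timestamp seen minus the bound; it never moves backwards. */
    long currentWatermark() {
        long candidate = currentMaxTimestamp - maxOutOfOrdernessMs;
        lastEmittedWatermark = Math.max(lastEmittedWatermark, candidate);
        return lastEmittedWatermark;
    }

    public static void main(String[] args) {
        // Elements are bare epoch-millisecond values; the bound is 5 seconds.
        BoundedWatermarkSketch<Long> sketch =
                new BoundedWatermarkSketch<>(5_000L, Long::longValue);
        long[] eventTimes = {10_000L, 8_000L, 12_000L};
        for (long eventTime : eventTimes) {
            long ts = sketch.extractTimestamp(eventTime);
            System.out.println("timestamp=" + ts + " watermark=" + sketch.currentWatermark());
        }
    }
}
```

The point of the sketch is that every record must yield a non-null event-time timestamp before the table is declared with a .rowtime field. The "Rowtime timestamp is null" message in the question is raised when a record reaches that conversion without one.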

Hope this helps.
