
NullPointerException when using Flink's leftOuterJoinLateral in Scala

I am trying to follow the documentation and create a Table Function to "flatten" some data. The Table Function works fine when I use joinLateral to do the flattening, but with leftOuterJoinLateral I get the error below. I'm using Scala and have tried both the Table API and SQL with the same result:

Caused by: java.lang.NullPointerException: Null result cannot be stored in a Case Class.

Here is my job:

import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.scala.StreamTableEnvironment
import org.apache.flink.table.api.scala._
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.functions.TableFunction

object example_job {
  // Split the List[Int] into multiple rows
  class Split() extends TableFunction[Int] {
    def eval(nums: List[Int]): Unit = {
      nums.foreach { x =>
        if (x != 3) {
          collect(x)
        }
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.createLocalEnvironment()
    val tableEnv = StreamTableEnvironment.create(env)
    val splitMe = new Split()

    // Create some dummy data
    val events: DataStream[(String, List[Int])] = env.fromElements(("simon", List(1,2,3)), ("jessica", List(3)))
    
    val table = tableEnv.fromDataStream(events, 'name, 'numbers)
      .leftOuterJoinLateral(splitMe('numbers) as 'number)
      .select('name, 'number)
    table.toAppendStream[(String, Int)].print()
    env.execute("Flink jira ticket example")
  }
}

When I change .leftOuterJoinLateral to .joinLateral, I get the expected inner-join result:

(simon,1)
(simon,2)

When using .leftOuterJoinLateral, I would expect something like:

(simon,1)
(simon,2)
(simon,null) // or (simon, None)
(jessica,null) // or (jessica, None)

Seems like this might be a bug with the Scala API? I wanted to check here first before raising a ticket just in case I'm doing something stupid!

The problem is that, by default, Flink expects all fields of a row to be non-null. That's why the program fails when it encounters the null result produced by the outer join. In order to accept null values, you either need to disable the null check via

val tableConfig = tableEnv.getConfig
tableConfig.setNullCheck(false)
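
In the job above, this setting would go right after the table environment is created. A minimal sketch (assuming the legacy planner, whose TableConfig exposes setNullCheck):

val env = StreamExecutionEnvironment.createLocalEnvironment()
val tableEnv = StreamTableEnvironment.create(env)
// Allow null values in result rows, e.g. the unmatched rows of the outer join
tableEnv.getConfig.setNullCheck(false)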

Or you must specify a result type that tolerates null values, e.g. a custom POJO output type whose fields are nullable reference types (java.lang.Integer instead of Int):

table.toAppendStream[MyOutput].print()

with

// POJO with public mutable (var) fields and a no-argument constructor, as Flink requires
class MyOutput(var name: String, var number: Integer) {
  def this() {
    this(null, null)
  }

  override def toString: String = s"($name, $number)"
}
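
For completeness, a sketch of how the sink line of the original job would change with this output type (the POJO field names have to match the selected columns 'name and 'number):

val table = tableEnv.fromDataStream(events, 'name, 'numbers)
  .leftOuterJoinLateral(splitMe('numbers) as 'number)
  .select('name, 'number)

// Emit MyOutput instead of a (String, Int) tuple so unmatched rows can carry a null number
table.toAppendStream[MyOutput].print()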
