Flink: why does a CoMap return a "DataStream with Product with Serializable" instead of just a DataStream?

I need to understand why eventStream.connect(otherStream).map(_ => Right(2), _ => Left("2")) does not produce a DataStream[Either[String, Int]] but a DataStream[Either[String, Int] with Product with Serializable]. I'm using an API that accepts a DataStream[Either[T, P]], and if I pass it a DataStream[Either[T, P] with Product with Serializable] I get a compile-time error. Can someone explain why this happens and give me a hint on how to fix it?

Here is an example:

object FlinkFoo {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Silly int source
    val eventStream: DataStream[Int] = env.addSource((sc: SourceContext[Int]) => {
      while (true) sc.collect(1)
    })

    // Silly String source
    val otherStream: DataStream[String] = env.addSource((sc: SourceContext[String]) => {
      while (true) sc.collect("1")
    })

    // I need to connect the two streams and then flatten them into one
    val connectedStream2: DataStream[Either[String, Int] with Product with Serializable] = eventStream.connect(otherStream).map(_ => Right(2), _ => Left("2"))

    /* Compile time error !!!!
     * found   : org.apache.flink.streaming.api.scala.DataStream[Either[String,Int] with Product with Serializable]
     * [error]  required: org.apache.flink.streaming.api.scala.DataStream[Either[?,?]]
     * [error] Note: Either[String,Int] with Product with Serializable <: Either[?,?], but class DataStream is invariant in type T.
     * [error] You may wish to define T as +T instead. (SLS 4.5)
     * [error]     fooMethod(connectedStream2)
     * [error]               ^
     **/
    fooMethod(connectedStream2)
  }

  def fooMethod[T, P](dataStream: DataStream[Either[T, P]]): Unit = {
    // do something
  }
}

You can try bringing Flink's Scala implicit serializers and TypeInformation into scope as follows:

import org.apache.flink.streaming.api.scala._

The package object imported above calls into the TypeUtils object, which provides the serializer and the required type information for Either as well as for many other types.

You need those implicits to resolve the Either type after Flink's generic type resolution, and you can add an explicit return type to your assignment to get that conversion. With the expected type spelled out, the compiler picks Either[String, Int] as the result type instead of inferring the least upper bound of Right and Left.

val yourEitherStream: DataStream[Either[String, Int]] =
  eventStream
    .connect(otherStream)
    .map(_ => Right(2), _ => Left("2"))
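
The effect of the explicit type can be reproduced without Flink. The following is only a minimal sketch under stated assumptions: Stream, Connected and CoMapSketch are hypothetical stand-ins (not Flink classes) that copy the shape of the problem, namely an invariant container and a two-function map sharing one result type R. It shows two ways to pin R: the declared type of the val, and an explicit type argument.

```scala
// Flink-free sketch; Stream, Connected and CoMapSketch are hypothetical stand-ins.
final class Stream[T](val items: List[T]) // invariant in T, like DataStream[T]

final class Connected[A, B](as: List[A], bs: List[B]) {
  // Same shape as a co-map: both functions must agree on one result type R.
  def map[R](f: A => R, g: B => R): Stream[R] = new Stream(as.map(f) ++ bs.map(g))
}

object CoMapSketch {
  // Accepts only Stream[Either[T, P]], not a Stream of a refined subtype.
  def fooMethod[T, P](s: Stream[Either[T, P]]): Int = s.items.size

  val connected = new Connected(List(1, 1), List("1"))

  // Option 1: the declared type of the val fixes R = Either[String, Int].
  val byAscription: Stream[Either[String, Int]] =
    connected.map(_ => Right(2), _ => Left("2"))

  // Option 2: pass R explicitly as a type argument.
  val byTypeArg = connected.map[Either[String, Int]](_ => Right(2), _ => Left("2"))

  def main(args: Array[String]): Unit = {
    println(fooMethod(byAscription))
    println(fooMethod(byTypeArg))
  }
}
```

In both variants the result type is exactly Either[String, Int], so the call to fooMethod type-checks even though the container is invariant. Whether the explicit type argument form works unchanged with Flink's own map depends on the implicit TypeInformation being in scope, as described above.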

The "with Product with Serializable" mix-in is a leftover of a Scala 2.11 issue that is resolved in 2.12 (which you can't use with Flink right now).
