
Filter Flink tuples

I'm writing a stream-processing program in Scala using Flink. I have a DataStream which I first map to tuples containing json4s JValues. Now I want to filter these tuples based on those JValues. I thought this would be simple, but I can't find a good example of how to filter Flink tuples by their fields. Does anyone know how to do this? Thanks

Instead of mapping to tuples, you could simply map to case classes and filter out unneeded stuff:

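// StreamingJob.scala

...

val filteredEvents = content
  .map(x => Event.toCaseClass(x)) // parse each JSON string into an Event
  .filter(x => x.value)           // keep only events whose flag is true

...

// Event.scala
// Assumes the json4s Jackson backend; use org.json4s.native.JsonMethods._ for json4s-native.
import org.json4s._
import org.json4s.jackson.JsonMethods._

case class Event(
  id: String,
  // Boolean so it can serve as the filter predicate; if the JSON field is
  // numeric, keep Int and compare against a number in the filter instead.
  value: Boolean
)

object Event {
  // extract needs an implicit Formats in scope
  implicit val formats: Formats = DefaultFormats

  def toCaseClass(str: String): Event =
    parse(str).extract[Event]
}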

The question is a little underspecified, but would something like this work?

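// Assumes a Flink Java tuple (not scala.Tuple2) and json4s on the classpath
import org.apache.flink.api.java.tuple.Tuple2
import org.apache.flink.streaming.api.scala._
import org.json4s._

// the stream contains values like this in a Flink tuple
// (custom deserializer of array to Tuple2???)
val jsonExample = """["foo", "bar"]"""

val stream: DataStream[Tuple2[JString, JString]] = ???
// f0 is the tuple's first field; JString keeps its text in `s`
val filteredStream = stream.filter(x => x.f0.s == "foo")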

I'd say it would be better not to use Flink tuples if you are writing Scala, though. Go for case classes, or at least Scala tuples.
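For the Scala-tuple route, a minimal sketch of what the predicates could look like. The names `Row`, `firstIsFoo` and `secondIsPositiveInt` are made up for illustration, and the two-column layout is an assumption since the question doesn't show the real data. The predicates are run on a plain `Seq` here so the snippet only needs json4s, not a Flink job.

```scala
import org.json4s._

object TupleFilters {
  // Hypothetical element type: a Scala tuple of two json4s values.
  type Row = (JValue, JValue)

  // Filter on the first column by position.
  // JString is a case class, so == compares the wrapped string.
  def firstIsFoo(row: Row): Boolean =
    row._1 == JString("foo")

  // Filter on the second column with pattern matching:
  // keep rows where it is a JSON integer greater than zero.
  def secondIsPositiveInt(row: Row): Boolean = row match {
    case (_, JInt(n)) => n > 0
    case _            => false
  }

  def main(args: Array[String]): Unit = {
    val rows: Seq[Row] = Seq(
      (JString("foo"), JInt(1)),
      (JString("bar"), JInt(2)),
      (JString("foo"), JString("x"))
    )
    println(rows.filter(firstIsFoo))
    println(rows.filter(secondIsPositiveInt))
  }
}
```

With Flink's Scala API the same functions should be usable as `stream.filter(row => TupleFilters.firstIsFoo(row))` on a `DataStream[Row]`. Passing a plain lambda like this sidesteps the overload between `filter(T => Boolean)` and `filter(FilterFunction[T])`.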

