Filtering Flink tuples

Question

I'm writing a stream processing program in Scala using Flink. I have a datastream which I first map to tuples containing json4s JValues. Now I want to filter these tuples based on those JValues. I thought this would be simple, but I can't find a good example of how to filter Flink tuples by their fields. Does anyone know how to do this? Thanks.

Answer 1

Instead of mapping to tuples, you could simply map to case classes and filter out what you don't need:

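// StreamingJob.scala
import org.apache.flink.streaming.api.scala._ // brings the implicit TypeInformation into scope

...

// content is assumed to be a DataStream[String] of JSON documents
val filteredEvents = content
  .map(x => Event.toCaseClass(x))
  .filter(x => x.value) // keep only events whose value is true

...

// Event.scala
import org.json4s._
import org.json4s.native.JsonMethods.parse // assumes the native backend; org.json4s.jackson.JsonMethods also provides parse

// value is a Boolean here so that it matches the filter above
case class Event(id: String, value: Boolean)

object Event {
  implicit val formats: Formats = DefaultFormats

  // Parses a JSON string such as {"id": "a", "value": true} into an Event
  def toCaseClass(str: String): Event =
    parse(str).extract[Event]
}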

Answer 2

The question seems a little underspecified to me, but wouldn't something like this work?

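import org.apache.flink.api.java.tuple.Tuple2 // Flink's Java tuple, not scala.Tuple2
import org.apache.flink.streaming.api.scala._
import org.json4s.JString

// The stream contains elements like this as a Flink tuple
// (produced by a custom deserializer from array to Tuple2?)
val jsonExample = """["foo", "bar"]"""

val stream: DataStream[Tuple2[JString, JString]] = ???
// f0 is the first field of the tuple; JString wraps its text in the member s
val filteredStream = stream.filter(x => x.f0.s == "foo")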

I'd say it would be better not to use Flink tuples if you are writing Scala, though. Maybe go for case classes, or at least Scala tuples?
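
For the Scala tuple route, a minimal sketch could look like the following. It assumes the stream elements are pairs of json4s JValues; the names TupleFilterSketch and firstIsFoo are made up for illustration, and the predicate is exercised on a plain Seq so that it runs without a Flink cluster.

```scala
import org.json4s._

object TupleFilterSketch {
  // Hypothetical predicate: keep pairs whose first field is the JSON string "foo".
  // Pattern matching on the json4s AST avoids the need for extract and implicit Formats.
  def firstIsFoo(t: (JValue, JValue)): Boolean = t match {
    case (JString("foo"), _) => true
    case _                   => false
  }

  def main(args: Array[String]): Unit = {
    val sample: Seq[(JValue, JValue)] = Seq(
      (JString("foo"), JString("bar")),
      (JString("baz"), JInt(1))
    )
    // Only the first pair satisfies the predicate.
    val kept = sample.filter(firstIsFoo)
    println(kept)
  }
}
```

On a DataStream[(JValue, JValue)] the same predicate should be usable as stream.filter(firstIsFoo _), since the Scala DataStream API accepts a plain function for filter.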

Answer 1: 0, accepted, 2018-01-17 11:18:27

Answer 2: 0, 2018-01-18 18:25:47
