
Flink: use multiple data classes for a single source

Some code:

implicit val formats = Serialization.formats(NoTypeHints)

case class DataClass(id: String, name: String)    

val dataSource = env
      .addSource(new FlinkKinesisConsumer[String](s"data-stream-$stage", new SimpleStringSchema, consumerConfig))
      .uid(s"data-stream-$stage-source-id").name("dataSource")
      .map(json => read[DataClass](json))

Here I read data from a Kinesis stream and deserialize it into my data class. Everything works fine, but now I need to be able to receive data in one more format (e.g. DataClassSecond).

One option is to add an additional data source and process each format in its own stream.

But this requires an additional Kinesis stream, and I'm not sure this is a good approach. Is there a way to receive different data from Kinesis and then split the stream depending on the type?

You may try to filter the DataStream[String] based on fields, so that you get two or more streams, each containing only elements with the proper JSON format.

So the simplest way to do it would be something like:

// Match the quoted key: a bare "name" would also match records containing "surname".
val streamDataClass = sourceStream.filter(_.contains("\"name\""))
val streamDataClassSecond = sourceStream.filter(_.contains("\"surname\""))

This will only work if the name and surname fields are unique to each data class. A somewhat more efficient approach would probably be to first map the DataStream to some common format, or to use something like Either as the deserialization result, and then check whether deserialization succeeded.
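The Either idea could be sketched roughly as follows. This is only an illustration, not a tested Flink job: it assumes the json4s native backend (the question does not say which backend is used), the fields of DataClassSecond are made up, and RecordParser and parse are hypothetical names. It relies on json4s failing to extract a case class when a required field is missing.

```scala
import scala.util.Try
import org.json4s.{Formats, NoTypeHints}
import org.json4s.native.Serialization
import org.json4s.native.Serialization.read

case class DataClass(id: String, name: String)
// Hypothetical second format; its fields are an assumption for illustration.
case class DataClassSecond(id: String, surname: String)

object RecordParser {
  implicit val formats: Formats = Serialization.formats(NoTypeHints)

  // Try DataClass first, then DataClassSecond.
  // Returns None when the record fits neither format (or is not valid JSON).
  def parse(json: String): Option[Either[DataClass, DataClassSecond]] = {
    val first: Option[Either[DataClass, DataClassSecond]] =
      Try(read[DataClass](json)).toOption.map(Left(_))
    first.orElse(Try(read[DataClassSecond](json)).toOption.map(Right(_)))
  }
}
```

A function like this could be applied to the DataStream[String] once, and the result then split into two typed streams (for example by checking isLeft / isRight, or with Flink side outputs). Note that the order of the attempts matters if the fields of one class are a subset of the other's, since extra JSON fields are ignored during extraction.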
