
Spark Structured Streaming for multiple partitions in a topic

How do we structure the JSON for multiple partitions in Spark Structured Streaming? The example below, which I have pasted here, covers only one partition. Appreciate your help.

spark.readStream().format("kafka")
        .option("kafka.bootstrap.servers", "****")
        .option("subscribePattern", "****.*")
        .option("startingOffsets", "{\"Topic01\": {\"0\":250, \"1\": -1}}").load();

You can use your favorite JSON library to create the string. Here is an example using json4s:

scala> import org.json4s.jackson.Serialization
import org.json4s.jackson.Serialization

scala> import org.json4s.NoTypeHints
import org.json4s.NoTypeHints

scala> implicit val formats = Serialization.formats(NoTypeHints)
formats: org.json4s.Formats{val dateFormat: org.json4s.DateFormat; val typeHints: org.json4s.TypeHints} = org.json4s.Serialization$$anon$1@7c206b14

scala> val offsets = Map("topic1" -> Map("0" -> 1, "1" -> -1, "2" -> -2), "topic2" -> Map("0" -> 0, "1" -> -1))
offsets: scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,Int]] = Map(topic1 -> Map(0 -> 1, 1 -> -1, 2 -> -2), topic2 -> Map(0 -> 0, 1 -> -1))

scala> Serialization.write(offsets)
res0: String = {"topic1":{"0":1,"1":-1,"2":-2},"topic2":{"0":0,"1":-1}} 
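Since the question's snippet is in Java, here is a minimal self-contained sketch of the same idea without any JSON library: build a `{topic -> {partition -> offset}}` map and serialize it by hand into the string that `startingOffsets` expects (where `-1` means latest and `-2` means earliest for a partition). The topic name `Topic02` is a placeholder added for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class StartingOffsets {

    // Serialize {topic -> {partition -> offset}} into the JSON shape that
    // Spark's Kafka source accepts for the "startingOffsets" option, e.g.
    // {"Topic01":{"0":250,"1":-1},"Topic02":{"0":-2,"1":-1}}
    static String toJson(Map<String, Map<String, Long>> offsets) {
        return offsets.entrySet().stream()
            .map(topic -> "\"" + topic.getKey() + "\":{"
                + topic.getValue().entrySet().stream()
                    .map(p -> "\"" + p.getKey() + "\":" + p.getValue())
                    .collect(Collectors.joining(","))
                + "}")
            .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is deterministic.
        Map<String, Map<String, Long>> offsets = new LinkedHashMap<>();

        Map<String, Long> topic01 = new LinkedHashMap<>();
        topic01.put("0", 250L);  // partition 0: start at offset 250
        topic01.put("1", -1L);   // partition 1: start at latest

        Map<String, Long> topic02 = new LinkedHashMap<>();
        topic02.put("0", -2L);   // partition 0: start at earliest
        topic02.put("1", -1L);   // partition 1: start at latest

        offsets.put("Topic01", topic01);
        offsets.put("Topic02", topic02);

        System.out.println(toJson(offsets));
        // The resulting string is what you would pass to
        // .option("startingOffsets", toJson(offsets)) before .load()
    }
}
```

The generated string can then be passed directly as the value of the `startingOffsets` option in the question's `spark.readStream()` chain, covering as many topics and partitions as the map contains.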
