How to set the type of an array with Dataset in Spark Scala
I have source data like this:
{"A": 123, "B": "Hello world", "C": [{"D": 123, "E": "Spark"}]}
And I have a case class:
case class TestClass(A: Int, B: String, C: ???)
val obj:Dataset[TestClass] = df.as[TestClass]
How should I define the type of C?
One option:
case class Nested(D: Long, E: String)
case class TestClass(A: Long, B: String, C: Seq[Nested])
Usage:
spark.read.json(sc.parallelize(Seq(
  """{"A": 123, "B": "Hello world", "C": [{"D": 123, "E": "Spark"}]}"""
))).as[TestClass].show
+---+-----------+-------------+
|  A|          B|            C|
+---+-----------+-------------+
|123|Hello world|[[123,Spark]]|
+---+-----------+-------------+
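Once the JSON is decoded, `C` is an ordinary Scala collection, so the nested fields can be used with the usual collection API. A minimal sketch of the case classes alone, without a Spark session (the `row` value is a hand-built stand-in for what the Dataset would produce):

```scala
case class Nested(D: Long, E: String)
case class TestClass(A: Long, B: String, C: Seq[Nested])

// A hand-constructed row matching the example JSON above,
// i.e. what `df.as[TestClass]` would give back for that record
val row = TestClass(123L, "Hello world", Seq(Nested(123L, "Spark")))

// Nested fields are plain Scala values once the Dataset is typed
val names: Seq[String] = row.C.map(_.E)
println(names.mkString(","))  // prints "Spark"
```

Because the schema is derived from the case classes, any mismatch between the JSON structure and the declared types (e.g. declaring `C: Nested` instead of `C: Seq[Nested]`) will fail at `.as[TestClass]` rather than silently producing nulls.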