Spark StructType for coalesce

I am using Spark 2.0.1 with Scala 2.11.

How can I use coalesce to provide a default value for a column that is a StructType?

Say I have:

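import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Nested struct with two integer fields
val ss = new StructType().add("x", IntegerType).add("y", IntegerType)

// Top-level schema: an integer column "a" and a struct column "b"
val s = new StructType()
    .add("a", IntegerType)
    .add("b", ss)

// The last row has a null struct in column "b"
val d = Seq(Row(1, Row(1, 2)), Row(2, Row(2, 3)), Row(2, null))

val rd = sc.parallelize(d)
val df = spark.createDataFrame(rd, s)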

Now, df.select($"b").show results in:

+-----+
| b   |
+-----+
|[1,2]|
|[2,3]|
| null|
+-----+

My question is: how can I provide a default value (say [0,0]) using coalesce?

You can use the struct function, passing two lit(0) values aliased to match the field names of the struct you already have:

import org.apache.spark.sql.functions.{coalesce, lit, struct}

df.select(coalesce($"b", struct(lit(0).as("x"), lit(0).as("y"))))
  .show()

// +---------------------------------------+
// |coalesce(b, struct(0 AS `x`, 0 AS `y`))|
// +---------------------------------------+
// |                                  [1,2]|
// |                                  [2,3]|
// |                                  [0,0]|
// +---------------------------------------+
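The select above produces a column named after the whole coalesce expression. If you would rather keep the column name b and the other columns, one option is to overwrite b with withColumn. The following is a minimal sketch under that assumption; the object name CoalesceStructDefault and the helpers sampleData and withDefaultB are hypothetical names introduced for illustration, and the local SparkSession setup is only there to make the example standalone (in spark-shell you would already have spark available).

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit, struct}
import org.apache.spark.sql.types.{IntegerType, StructType}

// Hypothetical helper object, not part of Spark.
object CoalesceStructDefault {

  // Default struct whose field names and types match column "b" (x: Int, y: Int).
  val defaultB = struct(lit(0).as("x"), lit(0).as("y"))

  // Rebuilds the DataFrame from the question, including the row with a null struct.
  def sampleData(spark: SparkSession): DataFrame = {
    val inner = new StructType().add("x", IntegerType).add("y", IntegerType)
    val schema = new StructType().add("a", IntegerType).add("b", inner)
    val rows = Seq(Row(1, Row(1, 2)), Row(2, Row(2, 3)), Row(2, null))
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
  }

  // Replaces null values in "b" with the default struct, keeping the column name "b".
  def withDefaultB(df: DataFrame): DataFrame =
    df.withColumn("b", coalesce(col("b"), defaultB))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("coalesce-struct-default")
      .getOrCreate()

    withDefaultB(sampleData(spark)).show()

    spark.stop()
  }
}
```

Note that the literals have to match the field types of the existing struct. Here lit(0) is an integer, which lines up with IntegerType; for other field types you would likely need a different literal or an explicit cast.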
