I have a Scala array of type Array[Map[String,String]]
and I want to convert it into a Spark DataFrame.
input:- Array(Map("col1" -> "val1"), Map("col2" -> "val2", "col1" -> "val1"), Map("col3" -> "val3") )
Expected output (Spark DataFrame):
col1 | col2 | col3 |
---|---|---|
val1 | NA | NA |
val1 | val2 | NA |
NA | NA | val3 |
What is the best way to do this?
Having:
val input = Seq(Map("col1" -> "val1"), Map("col2" -> "val2", "col1" -> "val1"), Map("col3" -> "val3"))
first collect the distinct keys, then look up every key in every map (filling missing entries with "NA") so each row lines up with the column order, and build the DataFrame from explicit rows and a schema:
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val keys = input.flatMap(_.keys).distinct
val rows = input.map(m => Row(keys.map(k => m.getOrElse(k, "NA")): _*))
val schema = StructType(keys.map(k => StructField(k, StringType, nullable = true)))
val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
Note that mapping each map to _.values.toSeq and calling values.toDF(keys: _*) does not work here: the values of different maps are not aligned to the same key order, and a Seq[Seq[String]] is encoded as a single array-typed column, not one column per key.
You can also call it more directly with the Spark implicits:
import spark.implicits._
input.toDF()
but this produces a single map-typed column (named value) rather than one column per key, so it needs further processing to match the expected output.
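A sketch of that map-column route, assuming an active SparkSession named `spark`: after input.toDF() you can pull each distinct key out of the map column with getItem, substituting "NA" for missing keys.

```scala
import org.apache.spark.sql.functions.{coalesce, col, lit}
import spark.implicits._  // assumes an active SparkSession named `spark`

val input = Seq(Map("col1" -> "val1"), Map("col2" -> "val2", "col1" -> "val1"), Map("col3" -> "val3"))

// toDF() on a Seq[Map[String,String]] yields one map<string,string> column named "value"
val mapDf = input.toDF()

val keys = input.flatMap(_.keys).distinct  // Seq("col1", "col2", "col3")

// Extract each key into its own column; absent keys come back null, replaced by "NA"
val df = mapDf.select(keys.map(k => coalesce(col("value").getItem(k), lit("NA")).as(k)): _*)
df.show()
```

This keeps everything inside the DataFrame API (no manual Row/schema construction), which is convenient when the data is already distributed.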