
How to convert a Scala Array of Maps into a Spark DataFrame

I have a Scala array of type Array[Map[String,String]] and I want to convert it into a Spark DataFrame.

Input:

Array(Map("col1" -> "val1"), Map("col2" -> "val2", "col1" -> "val1"), Map("col3" -> "val3"))

Expected output (a Spark DataFrame):

col1  col2  col3
val1  NA    NA
val1  val2  NA
NA    NA    val3

What is the best way to do this?

Having:

val input = Seq(Map("col1" -> "val1"), Map("col2" -> "val2", "col1" -> "val1"), Map("col3" -> "val3"))

you can collect the distinct keys and then align each map's values against them, filling in "NA" wherever a key is missing:

val keys = input.flatMap(_.keys).distinct
val values = input.map(m => keys.map(k => m.getOrElse(k, "NA")))

and then use the Spark implicits to build the DataFrame. Note that calling toDF on a Seq[Seq[String]] produces a single array column, so the per-key values have to be pulled out into their own columns:

import spark.implicits._
val df = values.toDF("value")
  .select(keys.zipWithIndex.map { case (k, i) => $"value"(i).as(k) }: _*)
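If you prefer something more explicit, here is a minimal end-to-end sketch (assuming an active SparkSession named spark, as in the snippets above) that builds the rows and schema by hand instead of going through the array column; it yields exactly the expected output shown in the question:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// One Row per map, defaulting missing keys to "NA".
val rows = input.map(m => Row(keys.map(k => m.getOrElse(k, "NA")): _*))

// Every column is a plain string.
val schema = StructType(keys.map(StructField(_, StringType, nullable = false)))

val byHand = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
byHand.show()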

You can also convert the maps directly (this relies on Spark's built-in Map encoder, available in recent Spark versions):

import spark.implicits._
val mapDf = input.toDF()

but this gives a single map-typed column (named value by default) rather than one column per key, so the keys still need to be extracted into separate columns.
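A sketch of that extraction, reusing the keys sequence from above; getItem returns null for keys absent from a given map, so na.fill supplies the "NA" placeholder:

import org.apache.spark.sql.functions.col

val expanded = mapDf
  .select(keys.map(k => col("value").getItem(k).as(k)): _*)
  .na.fill("NA", keys)
expanded.show()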
