
How to convert scala Array of Maps into Spark df

I have a Scala array of type Array[Map[String,String]] and I want to convert it into a Spark DataFrame.

Input: Array(Map("col1" -> "val1"), Map("col2" -> "val2", "col1" -> "val1"), Map("col3" -> "val3"))

Expected output (Spark DataFrame):

col1 col2 col3
val1 NA   NA
val1 val2 NA
NA   NA   val3

What is the best way to do this?

Having:

val input = Seq(Map("col1" -> "val1"), Map("col2" -> "val2", "col1" -> "val1"), Map("col3" -> "val3") )

you can collect the distinct keys and align each map's values against them, filling missing keys with "NA" (simply taking _.values would leave the values misaligned, since the maps don't all share the same keys):

val keys = input.flatMap(_.keys).distinct
val values = input.map(m => keys.map(k => m.getOrElse(k, "NA")))

and then use the spark implicits. Note that values is a Seq[Seq[String]], so toDF first produces a single array column, which has to be split back into one column per key:

import spark.implicits._
val df = values.toDF("arr")
  .select(keys.zipWithIndex.map { case (k, i) => $"arr"(i).as(k) }: _*)

I also THINK you can do it this way (not sure though): input.toDF("m") gives a single map-typed column, which you can then flatten into one column per key, using coalesce to turn the nulls for missing keys into "NA":

import spark.implicits._
import org.apache.spark.sql.functions.{coalesce, lit}
val df2 = input.toDF("m")
  .select(keys.map(k => coalesce($"m"(k), lit("NA")).as(k)): _*)
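The key-alignment step is plain Scala and can be sanity-checked without a SparkSession. A minimal sketch against the question's sample input:

```scala
// Sample input from the question.
val input = Seq(
  Map("col1" -> "val1"),
  Map("col2" -> "val2", "col1" -> "val1"),
  Map("col3" -> "val3")
)

// Union of all keys across the maps, in first-seen order.
val keys = input.flatMap(_.keys).distinct

// One row per map, padded with "NA" for keys that map lacks.
val rows = input.map(m => keys.map(k => m.getOrElse(k, "NA")))

println(keys) // List(col1, col2, col3)
rows.foreach(println)
```

Each element of rows now lines up positionally with keys, which is exactly the shape the expected output table calls for.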
