
Spark/Scala transform [map of array] to [map of map]

I am looking to change the way data is stored in one of my DataFrame's columns. The column "content-value" currently has this type:

 |-- content-value: map (nullable = true)
 |    |-- key: integer
 |    |-- value: array (valueContainsNull = true)
 |    |    |-- element: string (containsNull = true)

And the data is currently stored like this:

{4 -> [5191, 57, -46, POS2], 5 -> [5413, 56, 48, POS2], 2 -> [5421, -59, 47, POS2], 1 -> [5237, -59, -47, POS2], 3 -> [5153, -10, 42, POS1]} 

I would like to change that to a map of maps that would look like:

{4 -> {value -> 5191, x -> 57, y -> -46, pos -> POS2}, 5 -> {value -> 5413, x -> 56, y -> 48, pos -> POS2}, 2 -> {value -> 5421, x -> -59, y -> 47, pos -> POS2}, 1 -> {value -> 5237, x -> -59, y -> -47, pos -> POS2}, 3 -> {value -> 5153, x -> -10, y -> 42, pos -> POS1}} 

I've tried creating a new column with the keys ["value", "x", "y", "pos"] and using map_from_arrays, without success.

Would love some help!

Using a typed Dataset:

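// Assumes a SparkSession named `spark` is in scope (as in spark-shell).
import spark.implicits._

// Names the four positional elements of each array.
case class Value(value: String, x: String, y: String, pos: String)

val ds = spark.createDataset[Map[Int, Array[String]]](
  Seq(Map(4 -> Array("5191", "57", "-46", "POS2"))))

// Rebuild each map, turning every 4-element array into a Value.
val dsFinal =
  ds.map(el => el.map {
    case (key, value) => key -> Value(value(0), value(1), value(2), value(3))
  })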

This gives:

+----------------------------+
|value                       |
+----------------------------+
|{4 -> {5191, 57, -46, POS2}}|
+----------------------------+
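
Note that the output above shows each inner value as a struct (the `Value` case class) rather than a map keyed by field name. If a literal map of maps is needed, one option is to zip the field names with each array. Below is a minimal sketch in plain Scala, assuming every array holds its elements in the order value, x, y, pos; `fieldNames` and `toMapOfMaps` are names made up for this illustration and are not part of the answer above.

```scala
// Hypothetical helper names, introduced only for this sketch.
val fieldNames: Seq[String] = Seq("value", "x", "y", "pos")

// Pair each positional array element with its field name.
// zip stops at the shorter side, so a short array simply yields fewer keys.
def toMapOfMaps(in: Map[Int, Array[String]]): Map[Int, Map[String, String]] =
  in.map { case (key, values) => key -> fieldNames.zip(values.toSeq).toMap }

// Two rows taken from the sample data in the question.
val sample = Map(
  4 -> Array("5191", "57", "-46", "POS2"),
  3 -> Array("5153", "-10", "42", "POS1"))

val converted = toMapOfMaps(sample)
```

With `spark.implicits._` in scope, something like `ds.map(m => toMapOfMaps(m))` should apply the same conversion to the Dataset above and produce a `Dataset[Map[Int, Map[String, String]]]`.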
