簡體   English   中英

如何將地圖的RDD轉換為數據幀

[英]How to convert an RDD of Maps to dataframe

我有地圖的RDD,我想將其轉換為數據幀這是RDD的輸入格式

val mapRDD: RDD[Map[String, String]] = sc.parallelize(Seq(
   Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
   Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
   Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
   Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
   Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))

有沒有辦法轉換成數據幀,如

 val df=mapRDD.toDf

df.show

empid,  empName,    depId
12      Rohan       201
13      Ross        201
14      Richard     401
15      Michale     501
16      John        701

您可以輕松將其轉換為Spark DataFrame:

這是一個可以解決問題的代碼:

val mapRDD= sc.parallelize(Seq(
   Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
   Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
   Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
   Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
   Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))

val columns=mapRDD.take(1).flatMap(a=>a.keys)

val resultantDF=mapRDD.map{value=>
      val list=value.values.toList
      (list(0),list(1),list(2))
      }.toDF(columns:_*)

resultantDF.show()

輸出是:

+-----+-------+-----+
|empid|empName|depId|
+-----+-------+-----+
|   12|  Rohan|  201|
|   13|   Ross|  201|
|   14|Richard|  401|
|   15|Michale|  501|
|   16|   John|  701|
+-----+-------+-----+

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM