
Convert List into dataframe spark scala

I have a list with more than 30 strings. How can I convert the list into a DataFrame? What I tried:

For example:

val list = List("a","b","v","b").toDS().toDF()

Output:


+-------+
|  value|
+-------+
|a      |
|b      |
|v      |
|b      |
+-------+


Expected output:


+---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
|  a|  b|  v|  b|
+---+---+---+---+

Any help with this would be appreciated.

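List("a","b","c","d") represents four records with one field each, so the result shows one element per row.

To get the expected output, the row itself must contain four fields. So we wrap the values in a tuple: List(("a","b","c","d")) represents one row with four fields. Similarly, a list with two rows is List(("a1","b1","c1","d1"),("a2","b2","c2","d2")).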

scala> val list = sc.parallelize(List(("a", "b", "c", "d"))).toDF()
list: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: string, _4: string]

scala> list.show
+---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
|  a|  b|  c|  d|
+---+---+---+---+


scala> val list = sc.parallelize(List(("a1","b1","c1","d1"),("a2","b2","c2","d2"))).toDF
list: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: string, _4: string]

scala> list.show
+---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
| a1| b1| c1| d1|
| a2| b2| c2| d2|
+---+---+---+---+
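
Scala tuples stop at 22 elements, so the tuple approach above cannot cover the 30+ strings mentioned in the question. One possible way around that, sketched here under assumptions, is to build a single Row from the list and generate the schema from the list's length. The `values` list and the app name are placeholders for illustration, and the `_1`, `_2`, ... column names simply mimic the tuple output.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val spark = SparkSession.builder
  .master("local[*]")
  .appName("ListToSingleRow") // placeholder app name
  .getOrCreate()

// Placeholder data standing in for the 30+ strings from the question.
val values: List[String] = (1 to 30).map(i => s"v$i").toList

// One nullable string column per element, named _1, _2, ... like the tuple case.
val schema = StructType(
  values.indices.map(i => StructField(s"_${i + 1}", StringType, nullable = true))
)

// Row.fromSeq packs the whole list into a single row.
val rows = spark.sparkContext.parallelize(Seq(Row.fromSeq(values)))
val df = spark.createDataFrame(rows, schema)

df.show(false)
```

Because the schema is derived from `values`, the same code should work for any list length; for several rows, pass one `Row.fromSeq(...)` per inner list, each with the same number of elements.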

To use toDF, we have to import:

import spark.sqlContext.implicits._

Please refer to the following code:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[*]")
  .appName("Simple Application")
  .getOrCreate()

import spark.sqlContext.implicits._

val lstData = List(List("vks",30),List("harry",30))
val mapLst = lstData.map{case List(a:String,b:Int) => (a,b)}
val lstToDf = spark.sparkContext.parallelize(mapLst).toDF("name","age")
lstToDf.show

val llist = Seq(("bob", "2015-01-13", 4), ("alice", "2015-04-23", 10)).toDF("name", "date", "duration")
llist.show

This will do:

import org.apache.spark.sql.functions.{array, explode_outer, lit, map_from_arrays}

val data = List(("Value1", "Cvalue1", 123, 2254, 22), ("Value1", "Cvalue2", 124, 2255, 23))
val df = spark.sparkContext.parallelize(data).toDF("Col1", "Col2", "Expend1", "Expend2", "Expend3")
val cols = Array("Expend1", "Expend2", "Expend3")

// Pair each column name with its value, then explode the map into key/value rows.
val df1 = df
  .withColumn("keys", lit(cols))
  .withColumn("values", array($"Expend1", $"Expend2", $"Expend3"))
  .select($"Col1", $"Col2", explode_outer(map_from_arrays($"keys", $"values")))
df1.show(false)

