When I run the following on the spark-shell, I get a dataframe:
scala> val df = Seq(Array(1,2)).toDF("a")
scala> df.show(false)
+------+
|a     |
+------+
|[1, 2]|
+------+
But when I run the following to create a dataframe with two columns, I get an error:
scala> val df1 = Seq(Seq(Array(1,2)),"jf").toDF("a","b")
<console>:23: error: value toDF is not a member of Seq[Object]
val df1 = Seq(Seq(Array(1,2)),"jf").toDF("a","b")
How do I go about this? Is toDF only supported for sequences with primitive datatypes?
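You need a Seq of tuples for the toDF method to work. Seq(Seq(Array(1,2)),"jf") is a sequence of two unrelated elements (a Seq and a String), so Scala infers Seq[Object], and Spark has no encoder for Object. toDF is not limited to primitive types; it needs each element of the Seq to be one row, with one tuple field per column: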
val df1 = Seq((Array(1,2),"jf")).toDF("a","b")
// df1: org.apache.spark.sql.DataFrame = [a: array<int>, b: string]
df1.show
+------+---+
|     a|  b|
+------+---+
|[1, 2]| jf|
+------+---+
Add more tuples for more rows:
val df1 = Seq((Array(1,2),"jf"), (Array(2), "ab")).toDF("a","b")
// df1: org.apache.spark.sql.DataFrame = [a: array<int>, b: string]
df1.show
+------+---+
|     a|  b|
+------+---+
|[1, 2]| jf|
|   [2]| ab|
+------+---+
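A case class can stand in for the tuple when you want the column names to come from the row type instead of from the toDF arguments. The following is a minimal sketch, not the only way to do it: Record, ToDfExample and buildDf are hypothetical names chosen for illustration, and it assumes a local Spark installation with the Spark SQL dependency on the classpath. Outside the spark-shell you also have to import spark.implicits._ yourself, which the shell does for you.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical row type for illustration: each field becomes a column.
case class Record(a: Array[Int], b: String)

object ToDfExample {
  // Builds the same two-row dataframe as above, using Record instead of tuples.
  def buildDf(spark: SparkSession): DataFrame = {
    // Brings toDF into scope; the spark-shell imports this automatically.
    import spark.implicits._
    Seq(Record(Array(1, 2), "jf"), Record(Array(2), "ab")).toDF()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: running locally; adjust master for a real cluster.
    val spark = SparkSession.builder()
      .appName("toDF-example")
      .master("local[*]")
      .getOrCreate()

    val df2 = buildDf(spark)
    df2.printSchema() // columns are named after the case class fields
    df2.show()

    spark.stop()
  }
}
```

Tuples are the quicker choice in the shell; a case class mostly pays off once the same row shape is reused in several places.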