
How do I initialise an empty Dataset/DataFrame?

I need to build my results stage by stage, so I want to start with an empty Dataset/DataFrame:

Dataset<Row> output = spark.emptyDataFrame();
output = output.withColumn("Test", lit("Hello"));
output.show();

The above produces:

+----+
|Test|
+----+
+----+

The column is created, but the expected value is not applied to it. The same thing happens when I try to extend the Dataset/DataFrame with another column:

output = output.withColumn("Test2", lit("Hello 2"));
output.show();

which produces:

+----+-----+
|Test|Test2|
+----+-----+
+----+-----+

Obviously the lit functions above will be replaced with my real field calculations, but I don't understand why the above is not working as expected.

I'd appreciate any explanation or correction.

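spark.emptyDataFrame() has no rows, and withColumn only computes a value for rows that already exist, so the result keeps zero rows however many columns are added. Start from a DataFrame that already has at least one row, for example (in Scala):

import spark.implicits._

// One-row DataFrame with a single string column named "test"
val output = Seq("hello").toDF("test")
output.show()

+-----+
| test|
+-----+
|hello|
+-----+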
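
Since the question uses the Java API, here is a rough Java sketch of the same idea. It assumes a local SparkSession, and the class name SingleRowSeed and its build method are hypothetical names made up for illustration. Instead of spark.emptyDataFrame(), it seeds the result with spark.range(1), which yields exactly one row with a single "id" column, adds the computed columns, and then drops the helper column.

```java
import static org.apache.spark.sql.functions.lit;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical helper class, not part of Spark.
public class SingleRowSeed {

    // Builds a one-row DataFrame and adds the columns from the question.
    public static Dataset<Row> build(SparkSession spark) {
        return spark.range(1)                      // one row, one column named "id"
                .toDF()                            // Dataset<Long> -> Dataset<Row>
                .withColumn("Test", lit("Hello"))  // evaluated once for the single row
                .withColumn("Test2", lit("Hello 2"))
                .drop("id");                       // remove the seed column
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("single-row-seed")
                .master("local[1]")                // assumption: local test run
                .getOrCreate();

        // emptyDataFrame has no rows, so withColumn has nothing to fill in.
        System.out.println("empty rows: " + spark.emptyDataFrame().count());

        Dataset<Row> output = build(spark);
        System.out.println("seeded rows: " + output.count());
        output.show();

        spark.stop();
    }
}
```

The lit calls stand in for the real field calculations. If each stage produces many rows rather than one, the seed would instead be the first stage's own Dataset, and later stages would add columns to it in the same way.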
