
Write spark dataframe into existing parquet hive table

I want to write a Spark dataframe into an existing parquet Hive table. I am able to do it using df.write.mode("append").insertInto("myexistinghivetable"), but when I check the file system I can see that Spark wrote files with a .c000 extension. What do those files mean? And how do I write a dataframe into a parquet Hive table?

You can save the dataframe as parquet at the location your Hive table points to; after that you can alter the table in Hive if needed. (The .c000 suffix in file names like part-00000-...-c000.snappy.parquet is just part of Spark's output file naming; those are normal parquet data files.)

You can do it like this:

df.write.mode("append").parquet("HDFS directory path")

We can use df.write.partitionBy("mypartitioncols").format("parquet").mode(SaveMode.Append).saveAsTable("hivetable")
