How to save a Spark DataFrame into one partition of a partitioned Hive table?
raw_nginx_log_df.write.saveAsTable("raw_nginx_log")
The call above overwrites the whole table rather than a specific partition. I can work around the problem with the following code, but it is obviously not elegant.
raw_nginx_log_df.registerTempTable("tmp_table")
sql(s"INSERT OVERWRITE TABLE raw_nginx_log PARTITION (par='$PARTITION_VAR') SELECT * FROM tmp_table")
It seems that no similar question has been asked on stackoverflow.com before!
YourDataFrame.write
  .format("parquet")
  .option("path", "/pathHiveLocation")
  .mode(SaveMode.Append)
  .partitionBy("partitionCol")
  .saveAsTable("YourTable")
This works for Parquet files/tables; you may customize the format and options as per your requirement.
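If the goal is to overwrite only one partition without touching the rest of the table, Spark 2.3+ also supports dynamic partition overwrite. Below is a minimal sketch, assuming a Hive-enabled `SparkSession` and that `raw_nginx_log` is already a partitioned Hive table; table and column names are taken from the question, everything else is illustrative.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Sketch: requires a Hive metastore and Spark 2.3 or later.
val spark = SparkSession.builder()
  .appName("partition-overwrite-sketch")
  .enableHiveSupport()
  // "dynamic" means only the partitions present in the DataFrame
  // are overwritten; all other partitions are left untouched.
  .config("spark.sql.sources.partitionOverwriteMode", "dynamic")
  .getOrCreate()

// raw_nginx_log_df must include the partition column ("par")
// as its last column so insertInto maps columns by position.
raw_nginx_log_df.write
  .mode(SaveMode.Overwrite)
  .insertInto("raw_nginx_log")
```

With this configuration, `insertInto` replaces exactly the partitions whose `par` values appear in the DataFrame, avoiding both the full-table overwrite and the temp-table workaround from the question.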