
How can I save a Spark DataFrame as a partition of a partitioned Hive table?

How do I save a Spark DataFrame into one partition of a partitioned Hive table?

raw_nginx_log_df.write.saveAsTable("raw_nginx_log")

The approach above overwrites the whole table rather than a specific partition. Although I can work around the problem with the following code, it is obviously not elegant.

raw_nginx_log_df.registerTempTable("tmp_table")
sql(s"INSERT OVERWRITE TABLE raw_nginx_log PARTITION (par = '$PARTITION_VAR') SELECT * FROM tmp_table")

It seems that no similar question has been asked on stackoverflow.com before!

YourDataFrame.write.format("parquet").option("path", "/pathHiveLocation").mode(SaveMode.Append).partitionBy("partitionCol").saveAsTable("YourTable")

This works for Parquet files/tables; you can customize it for your requirements.
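If the goal is to overwrite only one partition rather than append, a sketch of an alternative (assuming Spark 2.3+ and that the table `raw_nginx_log` already exists, partitioned by `par` as in the question) uses dynamic partition overwrite with `insertInto`:

```scala
// Sketch, not a definitive recipe: assumes an existing SparkSession `spark`
// and a Hive table raw_nginx_log partitioned by column `par`.
// With "dynamic" mode, overwrite replaces only the partitions that appear
// in the DataFrame instead of truncating the whole table.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

raw_nginx_log_df.write
  .mode("overwrite")
  .insertInto("raw_nginx_log") // column order must match the table schema,
                               // with the partition column last
```

Note that `insertInto` matches columns by position, not by name, so the DataFrame's columns must be ordered to match the table definition.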

