I am writing a Spark application that saves log data into a directory, /logroot.
My code is:

myDF.write.mode('overwrite').partitionBy('date', 'site').save('/logroot')
I want to use overwrite mode so that I can re-process all of the daily data several times a week. My concern is that overwrite wipes the entire /logroot directory, not just the partitions involved in the write.
How can I solve this problem?
At the time of writing, the best solution seems to be append mode.
Thanks to all for the help and hope Spark guys will provide a more elegant solution option.
Roberto