Spark 2.4.0 writing empty DataFrame to Parquet on AWS S3

After a Spark 2.4.0 EMR job writes an empty DataFrame to AWS S3:

import org.apache.spark.sql.SaveMode

df
  .repartition(1) // collapse to a single write task
  .write
  .mode(SaveMode.Append)
  .partitionBy(/* some_partitions */)
  .parquet(target)

There is no output at the target S3 location. However, this is not what I would expect based on this resolved issue. There is no exception, but there are also no metadata files and no _SUCCESS file in the target folder.
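For reference, a minimal self-contained reproduction sketch; the schema, the dt partition column, and the bucket path are hypothetical placeholders, not taken from the actual job:

import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().appName("empty-df-write").getOrCreate()
import spark.implicits._

// An empty DataFrame with a hypothetical two-column schema,
// to be partitioned by the hypothetical column dt.
val emptyDf = Seq.empty[(String, String)].toDF("id", "dt")

emptyDf
  .repartition(1)
  .write
  .mode(SaveMode.Append)
  .partitionBy("dt")
  .parquet("s3://my-bucket/target/") // placeholder bucket/prefix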

Thanks in advance!

How about writing to the core node's HDFS? Do you see files written there?
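A sketch of that check, assuming the question's df and partition columns are in scope (the HDFS path below is a hypothetical placeholder):

// Write the same DataFrame to the core node's HDFS instead of S3.
df
  .repartition(1)
  .write
  .mode(org.apache.spark.sql.SaveMode.Append)
  .partitionBy(/* some_partitions */)
  .parquet("hdfs:///tmp/empty-df-check/") // placeholder HDFS path

// Then compare the two locations, e.g. from the master node:
//   hdfs dfs -ls -R /tmp/empty-df-check/
//   aws s3 ls --recursive s3://my-bucket/target/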
