
How to add extra metadata when writing Parquet files with Spark

It looks like Spark writes "org.apache.spark.sql.parquet.row.metadata" to the Parquet file footer by default. But what if I want to write some arbitrary metadata (such as version=123) to a Parquet file produced by Spark?

This does NOT work:

df.write().option("version","123").parquet("somefile.parquet");

I'm using Spark 1.6.2.

Column-level metadata: yes, see my comment.

Table-level comments/user metadata: see https://issues.apache.org/jira/browse/SPARK-10803
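
The comment referred to above is not reproduced here, so the following is only a guess at the column-level approach: a minimal sketch, assuming the Spark 1.6 Java API, that attaches a `Metadata` object to a column through `Column.as(alias, metadata)` before writing. The class name `ColumnMetadataExample`, the helper `versionMetadata`, and the use of `sqlContext.range` to build a sample DataFrame are illustrative assumptions, not part of the question. Spark stores its schema (including column metadata) as JSON under the footer key mentioned in the question, so the value should be visible when the file is read back with Spark; other Parquet readers will not see it as a top-level footer entry.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.MetadataBuilder;

public class ColumnMetadataExample {

    // Hypothetical helper: builds a Metadata object with a single "version" entry.
    static Metadata versionMetadata(String version) {
        return new MetadataBuilder().putString("version", version).build();
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("column-metadata")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Sample data (assumption): a single "id" column with values 0..2.
        DataFrame df = sqlContext.range(0, 3);

        // Re-alias the column under the same name, attaching the metadata.
        DataFrame tagged = df.select(df.col("id").as("id", versionMetadata("123")));

        // Overwrite so the example can be re-run without a "path exists" error.
        tagged.write().mode("overwrite").parquet("somefile.parquet");

        // Read the file back and print the metadata attached to the first column.
        DataFrame readBack = sqlContext.read().parquet("somefile.parquet");
        System.out.println(readBack.schema().fields()[0].metadata());

        sc.stop();
    }
}
```

On Spark 2.x and later, `DataFrame` in the Java API becomes `Dataset<Row>` and `SQLContext` is usually replaced by `SparkSession`; the `MetadataBuilder` and `Column.as` calls are the same. This only covers per-column metadata and does not add a free-standing key such as version=123 to the file footer.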

Sadly, not yet

