
How to write a Spark DataFrame to an XML file?

Sample:

scala> Frame.show()

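+----+-----+-----+--------------------+-----+
|year| make|model|             comment|blank|
+----+-----+-----+--------------------+-----+
|2012|Tesla|    S|          No comment|    R|
|1997| Ford| E350|Go get one now th...|    L|
|2015|Chevy| Volt|                 Try|    M|
+----+-----+-----+--------------------+-----+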

to:

<item>
    <year>2012</year>
    <make>Tesla</make>
    <model>S</model>
</item>

The simplest approach is to use the XML writer from spark-xml:

val path: String = ???
df.write.format("com.databricks.spark.xml")
  .option("rootTag", "items")
  .option("rowTag", "item")
  .save(path)

If for some reason that doesn't fit your needs, you can serialize each record individually and write the result with saveAsTextFile:

import org.apache.spark.sql.Row

def dumpXML(row: Row): String = ???  // serialize one Row to an XML string
df.rdd.map(dumpXML).saveAsTextFile(path)
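
The body of dumpXML is left open above. As a rough sketch of what it might do, the formatting logic can be kept free of Spark types so it is easy to try in a plain Scala REPL. The names escapeXml and fieldsToXml and the default row tag "item" are hypothetical, and the sketch assumes every column name is already a valid XML element name:

```scala
// Escape the characters that must not appear literally in XML text content.
// "&" is replaced first so the entities produced below are not escaped twice.
def escapeXml(s: String): String =
  s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

// Render one record as an XML element with one child element per field.
// Assumption: field names are valid XML element names (not checked here).
def fieldsToXml(fields: Seq[(String, Any)], rowTag: String = "item"): String = {
  val children = fields.map { case (name, value) =>
    // A null value is rendered as an empty element.
    val text = Option(value).map(_.toString).getOrElse("")
    s"    <$name>${escapeXml(text)}</$name>"
  }
  (Seq(s"<$rowTag>") ++ children ++ Seq(s"</$rowTag>")).mkString("\n")
}
```

With this in place, dumpXML could be written as `def dumpXML(row: Row): String = fieldsToXml(row.schema.fieldNames.toSeq.zip(row.toSeq))`. To emit only some columns, as in the sample output, select them first with df.select.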

You can add a root element using, for example, mapPartitions.
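
A minimal sketch of that idea: wrap the lines of each partition in an opening and a closing root tag. The helper name wrapWithRoot and the tag "items" are hypothetical, chosen only for illustration:

```scala
// Surround the serialized rows of one partition with a root element.
// The iterator is consumed lazily, so the partition is not buffered in memory.
def wrapWithRoot(rows: Iterator[String], rootTag: String): Iterator[String] =
  Iterator(s"<$rootTag>") ++ rows ++ Iterator(s"</$rootTag>")
```

It would be applied as `df.rdd.map(dumpXML).mapPartitions(wrapWithRoot(_, "items")).saveAsTextFile(path)`. Because saveAsTextFile writes one part file per partition, each file gets its own root element. If a single XML document is needed, one option is to call coalesce(1) on the RDD before mapPartitions, at the cost of writing everything from one task.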

