I have a DataFrame that I want to save as JSON. The DataFrame contains a column whose values are JSON strings. When I write the DataFrame out as JSON, that column is saved as a string field instead of a nested JSON field.
Spark version: 2.4.0, language: Scala
DataFrame:
+--------+---------------------------------+
|id | jsoncolumn |
+--------+---------------------------------+
|1000 | [{"A": 10}, {"A": 20, "B": 50}] |
+--------+---------------------------------+
When I use df.write.json("path"), I get the output below; jsoncolumn is saved as a string:
{
"id": 1000,
"jsoncolumn": "[{\"A\": 10}, {\"A\": 20, \"B\": 50}]"
}
Expected output:
{
"id": 1000,
"jsoncolumn": [
{
"A": 10
},
{
"A": 20,
"B": 50
}
]
}
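You can convert the column from StringType to a structured type (here, an array of structs) before writing, as below:
import org.apache.spark.sql.functions.{from_json, schema_of_json}
import spark.implicits._ // for the $"..." column syntax
// Infer the schema from the JSON string in the first row
val value = df.first().getAs[String]("jsonColumn")
df.withColumn("jsonColumn", from_json($"jsonColumn", schema_of_json(value)))
  .write.json("output/test")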
Output:
{
"id": "1000",
"jsonColumn": [
{
"A": 10
},
{
"A": 20,
"B": 50
}
]
}
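Note that a schema inferred from the first row only covers the fields that appear in that row, so fields that show up only in later rows would be dropped by from_json. If the shape of the JSON is known in advance, one alternative is to pass an explicit schema instead. The following is a minimal sketch under that assumption; the object name JsonColumnExample, the helper parseJsonColumn, the LongType field types, the local master setting, and the overwrite mode are all illustrative choices, not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{ArrayType, LongType, StructField, StructType}

// Hypothetical example object; names are for illustration only.
object JsonColumnExample {
  // Assumed schema for the sample data: an array of objects whose
  // integer fields A and B may each be absent.
  val jsonSchema: ArrayType = ArrayType(StructType(Seq(
    StructField("A", LongType, nullable = true),
    StructField("B", LongType, nullable = true)
  )))

  // Replace the string column with its parsed, structured form.
  def parseJsonColumn(df: DataFrame): DataFrame =
    df.withColumn("jsonColumn", from_json(col("jsonColumn"), jsonSchema))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-column-example")
      .master("local[*]") // assumption: running locally for a quick check
      .getOrCreate()
    import spark.implicits._

    // Same sample row as in the question.
    val df = Seq((1000, """[{"A": 10}, {"A": 20, "B": 50}]"""))
      .toDF("id", "jsonColumn")

    // The column is now an array of structs, so the JSON writer
    // emits it as a nested array rather than an escaped string.
    parseJsonColumn(df).write.mode("overwrite").json("output/test")

    spark.stop()
  }
}
```

An explicit schema makes the result independent of whichever row happens to come first, at the cost of having to keep the schema in sync with the data.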