I am using PySpark and I want to convert a Spark DataFrame into a JSON file with a specific structure. The DataFrame looks like this:
| Key | desc | value |
|:---- |:----:| -----:|
| 12345| type | AA |
| 12345| id | q1w2e3|
| 98765| type | BB |
| 98765| id | z1x2c3|
I need to convert it into JSON like this:
{
  "12345": {
    "type": "AA",
    "id": "q1w2e3"
  },
  "98765": {
    "type": "BB",
    "id": "z1x2c3"
  }
}
Any idea? Thank you
First, collect the DataFrame to the driver:
output = df.collect()
If you print output, you will get a list of Row objects, something like this:
[Row(Key=12345, desc='type', value='AA'), ...]
Now iterate over this list with a for loop and build a nested dictionary, using Key as the outer key and desc as the inner key. Row fields can be accessed by name:
result = {}
for row in output:
    result.setdefault(str(row["Key"]), {})[row["desc"]] = row["value"]
Once the dictionary is created, you can serialize it with json.dumps(result) (after import json).
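Putting the steps together, here is a minimal sketch of the grouping logic. To keep it runnable without a Spark session, the rows are a namedtuple stand-in for what df.collect() would return; the helper name rows_to_nested_dict is made up for illustration. It uses attribute access (row.Key), which pyspark.sql.Row also supports.

```python
import json
from collections import namedtuple

# Stand-in for the list returned by df.collect(). With a real DataFrame,
# replace this with: rows = df.collect()
Row = namedtuple("Row", ["Key", "desc", "value"])
rows = [
    Row(12345, "type", "AA"),
    Row(12345, "id", "q1w2e3"),
    Row(98765, "type", "BB"),
    Row(98765, "id", "z1x2c3"),
]


def rows_to_nested_dict(rows):
    """Group (Key, desc, value) rows into {Key: {desc: value}}."""
    nested = {}
    for row in rows:
        # JSON object keys are strings, so convert Key explicitly.
        nested.setdefault(str(row.Key), {})[row.desc] = row.value
    return nested


result = rows_to_nested_dict(rows)
json_text = json.dumps(result, indent=2)
print(json_text)
```

To write a file instead of a string, open a file and call json.dump(result, f). Note that collect() brings every row to the driver, so this approach suits data small enough to fit in driver memory.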