I am reading streaming data into a PySpark DataFrame. The data contains a few fields that are present in every record/request. I want to extract those fields into their own DataFrame columns and store the rest of the fields as a map in another DataFrame column, but I have not been able to achieve this.
Can someone help with this?
Example sample values:
{"event1":"Value","event2":"Value","event3":"Value","event4":"Value","event5":"Value","event6":"Value"}
{"event1":"Value","event2":"Value","event3":"Value","data1":"Value","data2":"Value","data3":"Value"}
Suppose event1, event2, and event3 are present in every row. I want to extract each of them into a separate DataFrame column and keep the rest of the fields as a map of key-value pairs in another DataFrame column.
You need to define a schema for your DataFrame and use from_json to parse the JSON string into a StructType column in Spark. You can then select the specific events as their own columns and create another DataFrame from the remaining events.