
Convert a column with JSON value to a data frame using Scala Spark

I found several helpful answers, but they all convert a JSON file to a DataFrame. In my case, I have a DataFrame whose columns contain JSON, like this:

s-timestamp : 2019-10-10

content : {"META":{"testA":"1","TABLENAME":"some_table_name"},"PINACOLADA":{"sampleID":"0","itemInserted":"2019-10-10","sampleType":"BASE"}}

I need to normalize the content column. How can I do that?

Welcome. There are a few ways of dealing with JSON strings in Spark DataFrame columns. You can use functions like get_json_object to extract specific fields from your JSON, or from_json to transform the field into a StructType with a given schema. Another option is to use spark.read.json to parse the column's contents and create a separate DataFrame from them. Have a look at my solution here and let me know if it helps.
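A minimal sketch of the first two approaches the answer mentions, using sample data shaped like the question (the field names and schema are assumed from the posted JSON snippet, not confirmed by the asker):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json, get_json_object}
import org.apache.spark.sql.types._

object JsonColumnDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-column-demo")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample row mirroring the question's data (schema assumed from the post).
    val df = Seq(
      ("2019-10-10",
       """{"META":{"testA":"1","TABLENAME":"some_table_name"},"PINACOLADA":{"sampleID":"0","itemInserted":"2019-10-10","sampleType":"BASE"}}""")
    ).toDF("s-timestamp", "content")

    // Option 1: pull out a single field with get_json_object and a JSON path.
    val withTableName = df.withColumn(
      "tableName",
      get_json_object(col("content"), "$.META.TABLENAME"))

    // Option 2: parse the whole column with from_json and an explicit schema,
    // then flatten the resulting struct into top-level columns.
    val schema = new StructType()
      .add("META", new StructType()
        .add("testA", StringType)
        .add("TABLENAME", StringType))
      .add("PINACOLADA", new StructType()
        .add("sampleID", StringType)
        .add("itemInserted", StringType)
        .add("sampleType", StringType))

    val parsed = df.withColumn("json", from_json(col("content"), schema))
    parsed
      .select(col("s-timestamp"), col("json.META.*"), col("json.PINACOLADA.*"))
      .show(false)

    spark.stop()
  }
}
```

get_json_object is convenient for grabbing one or two fields; from_json with a schema is the better fit here, since "normalizing" the column means promoting every nested field to its own typed column.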
