
Convert a column with JSON value to a data frame using Scala Spark

I found several helpful answers, but they all convert a JSON file to a DataFrame. In my case, I have a DataFrame whose columns contain JSON, like this:

s-timestamp : 2019-10-10

content : {"META":{"testA":"1","TABLENAME":"some_table_name"},"PINACOLADA":{"sampleID":"0","itemInserted":"2019-10-10","sampleType":"BASE"}}

I need to normalize the content column. How can I do that?

Welcome. There are a few ways of dealing with JSON strings in Spark DataFrame columns. You can use functions like get_json_object to extract specific fields from your JSON, or from_json to transform the field into a StructType with a given schema. Another option is to use spark.read.json to parse the column's contents and create a separate DataFrame from them. Have a look at my solution here and let me know if it helps.
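A minimal sketch of the first two approaches the answer mentions, using sample data shaped like the question (the field names and schema are assumed from the posted JSON snippet, not confirmed by the asker):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json, get_json_object}
import org.apache.spark.sql.types._

object JsonColumnDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-column-demo")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample row mirroring the question's data (schema assumed from the post).
    val df = Seq(
      ("2019-10-10",
       """{"META":{"testA":"1","TABLENAME":"some_table_name"},"PINACOLADA":{"sampleID":"0","itemInserted":"2019-10-10","sampleType":"BASE"}}""")
    ).toDF("s-timestamp", "content")

    // Option 1: pull out a single field with get_json_object and a JSON path.
    val withTableName = df.withColumn(
      "tableName",
      get_json_object(col("content"), "$.META.TABLENAME"))

    // Option 2: parse the whole column with from_json and an explicit schema,
    // then flatten the resulting struct into top-level columns.
    val schema = new StructType()
      .add("META", new StructType()
        .add("testA", StringType)
        .add("TABLENAME", StringType))
      .add("PINACOLADA", new StructType()
        .add("sampleID", StringType)
        .add("itemInserted", StringType)
        .add("sampleType", StringType))

    val parsed = df.withColumn("json", from_json(col("content"), schema))
    parsed
      .select(col("s-timestamp"), col("json.META.*"), col("json.PINACOLADA.*"))
      .show(false)

    spark.stop()
  }
}
```

get_json_object is convenient for grabbing one or two fields; from_json with a schema is the better fit here, since "normalizing" the column means promoting every nested field to its own typed column.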
