
Extract JSON object column to multiple columns in Spark using Java

I'm using Java 11 with Spark 3.3.0. Let's say I have a DataFrame with two columns: id (a string identifier) and properties (a string containing a single JSON object, e.g. {"foo":123,"bar":"test"}).

How can I "explode" all the name-value pairs of the properties object into multiple columns, e.g. id, foo, bar?

I've seen solutions using select(). But then I would have to take care to select the existing columns as well, wouldn't I? Is there a simpler way to add the additional columns as needed in one command?

I know I can add a single column at a time if I already know what properties to expect, like this:

// requires: import static org.apache.spark.sql.functions.*;
df = df.withColumn("foo", get_json_object(col("properties"), "$.foo"));
df = df.withColumn("bar", get_json_object(col("properties"), "$.bar"));

Is there a way to add all the columns necessary for the JSON object in the properties column in one fell swoop, even without knowing ahead of time what properties will be in each JSON object?

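You can dynamically add the properties of your JSON object as Dataset columns in Java 11 and Spark 3.3.0 as shown below. Note that the properties.* expansion applies to a struct column, as in the input shown here; a properties column that is still a plain JSON string has to be parsed into a struct first (for example with from_json).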

ds = ds.select("*", "properties.*");

Input:

+---+---+------------+
|id |col|properties  |
+---+---+------------+
|1  |foo|{123, null} |
|2  |bar|{456, test2}|
+---+---+------------+

Output:

+---+---+------------+---+-----+
| id|col|  properties|foo|  bar|
+---+---+------------+---+-----+
|  1|foo| {123, null}|123| null|
|  2|bar|{456, test2}|456|test2|
+---+---+------------+---+-----+
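
Since the question's properties column holds a JSON string rather than a struct, it needs to be parsed before the star-expansion can be used. The following is a minimal sketch of one way to do that without knowing the keys in advance: it lets Spark infer a schema from the JSON strings themselves and then applies from_json. The class name ExplodeJsonProperties, the method name explodeProperties, and the sample rows are illustrative assumptions, not part of the original answer.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import java.util.List;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;

// Hypothetical helper class, named here only for illustration.
public class ExplodeJsonProperties {

    // Parses the JSON string column "properties" into a struct and
    // appends one top-level column per key found in the data.
    public static Dataset<Row> explodeProperties(SparkSession spark, Dataset<Row> ds) {
        // Infer the schema by reading the JSON strings as a JSON data source.
        // This triggers an extra pass over the data.
        StructType schema = spark.read()
                .json(ds.select("properties").as(Encoders.STRING()))
                .schema();

        return ds
                // Replace the string column with a parsed struct column.
                .withColumn("properties", from_json(col("properties"), schema))
                // Keep all existing columns and add the struct's fields.
                .select("*", "properties.*");
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("explode-json-properties")
                .getOrCreate();

        // Sample input (assumed for this sketch): both columns are strings.
        StructType inputSchema = new StructType()
                .add("id", DataTypes.StringType)
                .add("properties", DataTypes.StringType);
        Dataset<Row> ds = spark.createDataFrame(List.of(
                RowFactory.create("1", "{\"foo\":123}"),
                RowFactory.create("2", "{\"foo\":456,\"bar\":\"test2\"}")),
                inputSchema);

        // Result has id, properties, and one column per JSON key.
        explodeProperties(spark, ds).show(false);

        spark.stop();
    }
}
```

Keys that are missing from a given row come out as null in that row. If a JSON key has the same name as an existing column (for example id), the result will contain two columns with that name, so such keys would need to be renamed or dropped. Appending .drop("properties") removes the parsed struct if only the flattened columns are wanted.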

Hope this is what you need, good luck!
