
How to merge all unique values of a Spark DataFrame column into a single row based on id and convert the column into JSON format

How can I merge all unique values of a Spark DataFrame column into a single row per id and convert that column to JSON format?

Sample input:

+---+------+-----------+
|id |gender|banner_desc|
+---+------+-----------+
|123|male  |banner1    |
|123|male  |banner2    |
|123|male  |banner3    |
|124|female|banner1    |
|124|female|banner2    |
|125|male  |banner1    |
|126|female|banner3    |
+---+------+-----------+

Sample output:

+---+------+-------------------------------------------------------------+
|id |gender|banner_desc                                                  | 
+---+------+-------------------------------------------------------------+
|123|male  |[{"name":"banner1"}, {"name":"banner2"}, {"name":"banner3"}] |
|124|female|[{"name":"banner1"}, {"name":"banner2"}]                     |
|125|male  |[{"name":"banner1"}]                                         |
|126|female|[{"name":"banner3"}]                                         |
+---+------+-------------------------------------------------------------+

You can use to_json over collect_list(struct(...)) to get a JSON string:

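import org.apache.spark.sql.functions.{col, collect_list, struct, to_json}

// Group by id and gender, wrap each banner_desc value in a struct with a
// single "name" field, collect the structs into an array, and serialize it.
val result = df.groupBy(
    "id","gender"
).agg(
    to_json(
        collect_list(
            struct(col("banner_desc").as("name"))
        )
    ).as("banner_desc")
)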

result.show(false)
+---+------+----------------------------------------------------------+
|id |gender|banner_desc                                               |
+---+------+----------------------------------------------------------+
|124|female|[{"name":"banner1"},{"name":"banner2"}]                   |
|126|female|[{"name":"banner3"}]                                      |
|125|male  |[{"name":"banner1"}]                                      |
|123|male  |[{"name":"banner1"},{"name":"banner2"},{"name":"banner3"}]|
+---+------+----------------------------------------------------------+
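
The question asks for unique values, while collect_list keeps every occurrence, so a repeated (id, gender, banner_desc) row would appear twice in the JSON array. A minimal sketch of one way to handle that, assuming duplicates can occur and a stable element order is wanted: swap collect_list for collect_set and wrap it in sort_array. The object name UniqueBannersToJson, the helper bannersToJson, the sample rows, and the local SparkSession settings are illustrative assumptions, not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_set, sort_array, struct, to_json}

object UniqueBannersToJson {

  // Hypothetical helper: expects columns id, gender, banner_desc.
  def bannersToJson(df: DataFrame): DataFrame =
    df.groupBy("id", "gender")
      .agg(
        to_json(
          // collect_set drops duplicate structs; sort_array orders them by "name"
          // so the JSON does not depend on the order in which rows were read.
          sort_array(collect_set(struct(col("banner_desc").as("name"))))
        ).as("banner_desc")
      )
      .orderBy("id")

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would use its own session.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("unique-banners-to-json")
      .getOrCreate()
    import spark.implicits._

    // Sample data from the question, plus one deliberately duplicated row.
    val df = Seq(
      ("123", "male", "banner1"),
      ("123", "male", "banner1"),
      ("123", "male", "banner2"),
      ("123", "male", "banner3"),
      ("124", "female", "banner1"),
      ("124", "female", "banner2"),
      ("125", "male", "banner1"),
      ("126", "female", "banner3")
    ).toDF("id", "gender", "banner_desc")

    bannersToJson(df).show(false)
    spark.stop()
  }
}
```

If the input is known to contain no duplicate rows, the collect_list version above gives the same content, although the order of elements inside each array is then not guaranteed.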

