
Athena/Presto: SQL for producing array of nested maps/structs from flat rows

Suppose I have data like this:

with users (user_id, name) as (
    values (1, 'Alice'),
           (2, 'Bob'),
           (3, 'Charlie')
), todos (todo_id, user_id, title) as (
    values (1, 1, 'todo 1'),
           (2, 1, 'todo 2'),
           (3, 2, 'todo 3'),
           (4, 3, 'todo 4')
)

I want a query to produce JSON like the following example:

[
    {
        "user_id": 1,
        "name": "Alice",
        "todos": [
            {
                "todo_id": 1,
                "title": "todo 1"
            },
            {
                "todo_id": 2,
                "title": "todo 2"
            }
        ]
    },
    {
        "user_id": 2,
        "name": "Bob",
        "todos": [
            {
                "todo_id": 3,
                "title": "todo 3"
            }
        ]
    },
    {
        "user_id": 3,
        "name": "Charlie",
        "todos": [
            {
                "todo_id": 4,
                "title": "todo 4"
            }
        ]
    }
]

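Is there any way to do this natively in Athena/Presto?

Thanks to an answer I found from @gurustrom, I have gotten somewhat close:

  • Making nested objects with only 1 property seems fairly easy
  • You can use ARRAY[] together with map_from_entries() to create struct-like types, but the values must all have the same type, so I don't see how to create an object with mixed value types, for example:
    • { "user_id": 1, "name": "Foo", "todos": [{}] }

The SQL below is somewhat on the right track, but it produces the wrong output shape (shown first, followed by the query), and every combination I have tried has failed: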

[{"2":{"todos":[{"id":3}]}},{"1":{"todos":[{"id":1},{"id":2}]}},{"3":{"todos":[{"id":4}]}}]
select
    cast(array_agg(res) as JSON) as result
    from
    (
        select map_agg(user_id, m) as res
        from
            (
                select
                    user_id,
                    map_agg('todos', todos) as m
                from
                    (
                        select
                            user_id,
                            array_agg(todo) as todos
                        from
                            (
                                select
                                    user_id,
                                    map_agg('id', todo_id) as todo
                                from
                                    todos
                                group by
                                    user_id,
                                    todo_id
                            ) t
                        group by
                            user_id
                    ) t
                group by
                    user_id
            ) t
        group by
            user_id
    ) t
group by
    true;

You can leverage the map function and some casts to JSON:

-- sample data
with users (user_id, name) as (
    values (1, 'Alice'),
           (2, 'Bob'),
           (3, 'Charlie')
),
todos (todo_id, user_id, title) as (
    values (1, 1, 'todo 1'),
           (2, 1, 'todo 2'),
           (3, 2, 'todo 3'),
           (4, 3, 'todo 4')
),
-- query
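preprocess as (
    -- Cast each scalar to JSON so that values of different types can share one map.
    select cast(u.user_id as json) user_id,
           cast(max(u.name) as json) name,
           -- Build one map per todo and collect them into an array per user.
           array_agg(map(array['todo_id', 'title'], array[cast(t.todo_id as json), cast(t.title as json)])) todos
    from users u
             join todos t on t.user_id = u.user_id
    group by u.user_id)

-- Build one map per user; todos is cast to JSON so all three values have the same type.
select cast(
               map(array['user_id', 'name', 'todos'], array[user_id, name, cast(todos as json)])
           as json)
from preprocess;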

Output:

_col0
{"name":"Charlie","todos":[{"title":"todo 4","todo_id":4}],"user_id":3}
{"name":"Alice","todos":[{"title":"todo 1","todo_id":1},{"title":"todo 2","todo_id":2}],"user_id":1}
{"name":"Bob","todos":[{"title":"todo 3","todo_id":3}],"user_id":2}
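
The query above returns one JSON object per row, whereas the question asks for a single JSON array. One possible way to get there, as a minimal sketch that assumes the same sample data and the same preprocess CTE, is to collect the per-user maps with array_agg before the final cast. The result alias is only illustrative, and the order of the elements inside the array is not guaranteed.

```sql
-- sample data (same as above)
with users (user_id, name) as (
    values (1, 'Alice'),
           (2, 'Bob'),
           (3, 'Charlie')
),
todos (todo_id, user_id, title) as (
    values (1, 1, 'todo 1'),
           (2, 1, 'todo 2'),
           (3, 2, 'todo 3'),
           (4, 3, 'todo 4')
),
-- same preprocessing step: every value is cast to JSON
preprocess as (
    select cast(u.user_id as json) user_id,
           cast(max(u.name) as json) name,
           array_agg(map(array['todo_id', 'title'],
                         array[cast(t.todo_id as json), cast(t.title as json)])) todos
    from users u
             join todos t on t.user_id = u.user_id
    group by u.user_id
)
-- collect all per-user maps into one array, then serialize the whole array
select cast(
           array_agg(
               map(array['user_id', 'name', 'todos'],
                   array[user_id, name, cast(todos as json)])
           )
       as json) as result  -- "result" is an illustrative alias, not from the original answer
from preprocess;
```

As in the per-row output above, the keys inside each object come out in alphabetical order rather than in the order requested in the question. Because the join is an inner join, a user with no todos would not appear in the array at all.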

