Query to extract ids from a deeply nested JSON array object in Presto
I'm using Presto and trying to extract every 'id' whose 'source' is 'dd' from a nested JSON structure like the following:
{
  "results": [
    {
      "docs": [
        {
          "id": "apple1",
          "source": "dd"
        },
        {
          "id": "apple2",
          "source": "aa"
        },
        {
          "id": "apple3",
          "source": "dd"
        }
      ],
      "group": 99806
    }
  ]
}
I expect to extract the ids [apple1, apple3] into a column in Presto. What is the right way to achieve this in a Presto query?
If your data has a regular structure, as in the example you posted, you can parse the value as JSON, cast it to a structured SQL type (array/map/row), and then use the array processing functions to filter, transform, and extract the elements you want:
WITH data(value) AS (VALUES '{
  "results": [
    {
      "docs": [
        {
          "id": "apple1",
          "source": "dd"
        },
        {
          "id": "apple2",
          "source": "aa"
        },
        {
          "id": "apple3",
          "source": "dd"
        }
      ],
      "group": 99806
    }
  ]
}'),
parsed(value) AS (
    SELECT cast(json_parse(value) AS row(results array(row(docs array(row(id varchar, source varchar)), "group" bigint))))
    FROM data
)
SELECT
    transform(                                         -- extract the id from the resulting docs
        filter(                                        -- filter docs with source = 'dd'
            flatten(                                   -- flatten all docs arrays into a single doc array
                transform(value.results, r -> r.docs)  -- extract the docs arrays from the result array
            ),
            doc -> doc.source = 'dd'),
        doc -> doc.id)
FROM parsed
The query above produces:
_col0
------------------
[apple1, apple3]
(1 row)
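The result above is a single row holding an array. If you would rather have one row per id, a minimal sketch is to wrap the same expression in another CTE and expand the array with UNNEST. The names ids, id_array, and t are placeholders chosen for this illustration and are not part of the original query; the sample JSON is the same data, written on one line.

```sql
WITH data(value) AS (VALUES '{"results": [{"docs": [{"id": "apple1", "source": "dd"}, {"id": "apple2", "source": "aa"}, {"id": "apple3", "source": "dd"}], "group": 99806}]}'),
parsed(value) AS (
    -- same cast as in the query above
    SELECT cast(json_parse(value) AS row(results array(row(docs array(row(id varchar, source varchar)), "group" bigint))))
    FROM data
),
ids(id_array) AS (  -- hypothetical CTE name and column name
    -- same filter/transform pipeline as above, yielding an array(varchar)
    SELECT transform(
               filter(
                   flatten(transform(value.results, r -> r.docs)),
                   doc -> doc.source = 'dd'),
               doc -> doc.id)
    FROM parsed
)
SELECT t.id
FROM ids
CROSS JOIN UNNEST(id_array) AS t(id)  -- one output row per array element
```

Because id_array is an array of plain varchar values, UNNEST yields a single id column, so this should behave the same way regardless of how your Presto version handles unnesting arrays of rows.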