Break JSON list of values into rows in a Snowflake database table
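Here is one way to do it. First strip the [" and "], because the double quotes in the city column wrap the whole list rather than each individual array element. Then tokenize the string with strtok_to_array so that it comes back as a real array, flatten the array elements into separate rows, and laterally join those rows (the cities) back to the rest of the record.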
with data as
( -- sample rows: each city value is a string holding one quoted, comma-separated list
  select 'A' as name, 'M' as gender, '["completed"]' as orders, '["Cochi,Hyderabad"]' as city
  union all
  select 'B' as name, 'M' as gender, '["completed"]' as orders, '["Cochi,Hyderabad,Delhi"]' as city
  union all
  select 'C' as name, 'F' as gender, '["cancelled"]' as orders, '["Mumbai,Pune"]' as city
  union all
  select 'D' as name, 'M' as gender, '["pending"]' as orders, '["cochi"]' as city
)
, data2 as
( -- strip the leading [" and trailing "], then split the city list on commas
  select d.name
       , d.gender
       , replace(replace(d.orders,'["',''),'"]','') as orders
       , strtok_to_array(replace(replace(d.city,'["',''),'"]',''),',') as city
  from data d
)
-- flatten the city array into one row per city, joined back to its source row
select d2.name
     , d2.gender
     , d2.orders
     , replace(c.value,'"','') as city
from data2 d2
   , lateral flatten(input => d2.city) c;
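To see the intermediate step on its own, the following sketch runs the stripping and tokenizing on a single literal. It also shows a possible alternative to the nested replace calls: since the stored text is itself valid JSON, parse_json can read it and the first element can be taken directly. The literal value and the column aliases are only for illustration.

```sql
-- Sketch only: inspect the array that strtok_to_array produces for one value.
select
    -- approach used in the query above: strip [" and "], then split on commas
    strtok_to_array(replace(replace('["Cochi,Hyderabad,Delhi"]', '["', ''), '"]', ''), ',') as city_via_replace,
    -- alternative: parse the text as JSON, take element 0, then split on commas
    strtok_to_array(parse_json('["Cochi,Hyderabad,Delhi"]')[0]::varchar, ',') as city_via_parse_json;
```

Both columns should hold an array of three elements: Cochi, Hyderabad and Delhi.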
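If the array in the city field is actually an array containing a single comma-separated string, such as ["Cochi, Hyderabad"], you need a third level of flattening using STRTOK_TO_ARRAY, as follows: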
create or replace table t1 (json variant) as
select parse_json('{"Name": "A", "Gender": "M", "orders": ["completed"], "city": ["Cochi, Hyderabad"]}')
union all
select parse_json('{"Name": "B", "Gender": "M", "orders": ["completed"], "city": ["Cochi, Hyderabad, Delhi"]}')
union all
select parse_json('{"Name": "C", "Gender": "F", "orders": ["cancelled"], "city": ["Mumbai, Pune"]}')
union all
select parse_json('{"Name": "D", "Gender": "M", "orders": ["pending"], "city": ["Cochi"]}')
;
select
json:Name::varchar name,
json:Gender::varchar gender,
orders.value::varchar orders,
city.value::varchar city
from t1,
lateral flatten(json:orders) orders,
lateral flatten(json:city) city_raw,
lateral flatten(strtok_to_array(city_raw.value, ', ')) city
;
/*
NAME GENDER ORDERS CITY
A M completed Cochi
A M completed Hyderabad
B M completed Cochi
B M completed Hyderabad
B M completed Delhi
C F cancelled Mumbai
C F cancelled Pune
D M pending Cochi
*/
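To see what a single FLATTEN call in the query above contributes, it can be run by itself against a literal array. This is a minimal sketch; the literal array and the alias f are made up for illustration.

```sql
-- Sketch only: FLATTEN turns each array element into one row.
-- INDEX is the element's position in the array, VALUE is the element itself.
select f.index,
       f.value::varchar as city
from table(flatten(input => parse_json('["Cochi", "Hyderabad", "Delhi"]'))) f;
```

This should return three rows, with index 0, 1 and 2 paired with Cochi, Hyderabad and Delhi.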
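LATERAL FLATTEN is an idiom that flattens (expands) the values in an object (JSON) or an array into rows and combines them with the original row of the parent table.
So, the query above does the following:
1. Flattens the array in the orders field, expanding its elements into rows of the ORDERS column in the output.
2. Flattens the array in the city field, expanding its elements into rows of the intermediate CITY_RAW column.
3. Splits the comma-separated string taken from the CITY_RAW column into city names, and flattens that intermediate array of strings into rows of the CITY column.
This way you get the desired output without any complicated string manipulation.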
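One caveat worth checking against your data: STRTOK_TO_ARRAY treats every character in its delimiter argument as a separate delimiter, so ', ' splits on commas and on spaces. A multi-word name such as New Delhi (a hypothetical value, not in the sample data) would come out as two rows. A minimal sketch of an alternative, assuming the t1 table above, splits on the comma alone with SPLIT and trims each piece:

```sql
-- Sketch only: split on the comma, then trim the surrounding spaces,
-- so that a multi-word city name stays in one piece.
select
    json:Name::varchar          as name,
    json:Gender::varchar        as gender,
    orders.value::varchar       as orders,
    trim(city.value::varchar)   as city
from t1,
    lateral flatten(input => json:orders) orders,
    lateral flatten(input => json:city) city_raw,
    lateral flatten(input => split(city_raw.value::varchar, ',')) city
;
```

For the sample rows in t1 this should give the same rows as the query above; the difference only shows up when a city name contains a space.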
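However, I suspect the array stored in the JSON actually looks like ["Cochi", "Hyderabad"], that is, an array of city names with each name in its own double quotes.
In that case the query is simple: you only need to flatten the arrays in the orders field and the city field, using one LATERAL FLATTEN for each.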
create or replace table t2 (json variant) as
select parse_json('{"Name": "A", "Gender": "M", "orders": ["completed"], "city": ["Cochi", "Hyderabad"]}')
union all
select parse_json('{"Name": "B", "Gender": "M", "orders": ["completed"], "city": ["Cochi", "Hyderabad", "Delhi"]}')
union all
select parse_json('{"Name": "C", "Gender": "F", "orders": ["cancelled"], "city": ["Mumbai", "Pune"]}')
union all
select parse_json('{"Name": "D", "Gender": "M", "orders": ["pending"], "city": ["Cochi"]}')
;
select
json:Name::varchar name,
json:Gender::varchar gender,
orders.value::varchar orders,
city.value::varchar city
from t2,
lateral flatten(json:orders) orders,
lateral flatten(json:city) city
;
/*
NAME GENDER ORDERS CITY
A M completed Cochi
A M completed Hyderabad
B M completed Cochi
B M completed Hyderabad
B M completed Delhi
C F cancelled Mumbai
C F cancelled Pune
D M pending Cochi
*/