
Break JSON list of values into rows in a SNOWFLAKE database table

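I have a table as shown in the screenshot below. It is essentially JSON that has been parsed to produce this output, and I now want the lists of values in the city and orders columns split into rows.

Can someone help me with this?

[Screenshot: current table]

The desired output is as follows:

[Screenshot: desired output]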

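Here is one way to do it. First strip the [" and "], because the double quotes in the city column do not enclose individual array elements but all of them together. Then tokenize the string with strtok_to_array so that it comes back as a real array, flatten the array elements into separate rows, and laterally join those rows (the cities) back to the rest of the record.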

with data as
(select 'A' as name, 'M' as gender, '["completed"]' as orders, '["Cochi,Hyderabad"]' as city
union all
 select 'B' as name, 'M' as gender, '["completed"]' as orders, '["Cochi,Hyderabad,Delhi"]' as city
union all
 select 'C' as name, 'F' as gender, '["cancelled"]' as orders, '["Mumbai,Pune"]' as city
union all
 select 'D' as name, 'M' as gender, '["pending"]' as orders, '["cochi"]' as city
)
, data2 as 
( -- strip the [" and "] wrappers, then split the city string into a real array
 select d.name
 , d.gender
 , replace(replace(d.orders,'["',''),'"]','') as orders
 , strtok_to_array(replace(replace(d.city,'["',''),'"]',''),',')  as city
 from data d
)
 -- one output row per city, joined back to the rest of the record
 select d2.name
 , d2.gender
 , d2.orders
 , c.value::varchar as city
 from data2 d2
  , lateral flatten(input => d2.city) c;
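
Because the city column holds valid JSON text, the quote stripping can also be left to PARSE_JSON. The following is a minimal sketch of that alternative, not part of the original answer. The one-row data CTE is made up for illustration, and the query assumes the column always contains a JSON array with exactly one comma-separated string in it:

```sql
-- Hypothetical one-row sample in the same shape as the data above
with data as (
  select 'B' as name, '["Cochi,Hyderabad,Delhi"]' as city
)
select d.name
     , c.value::varchar as city
from data d
   -- parse_json(...)[0] picks the single string inside the array;
   -- strtok_to_array then splits that string on commas
   , lateral flatten(input => strtok_to_array(parse_json(d.city)[0]::varchar, ',')) c
order by c.index;
-- B  Cochi
-- B  Hyderabad
-- B  Delhi
```

If the column could ever hold more than one array element, this sketch would silently ignore everything after the first one.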

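If the array in the city field really is an array containing a single comma-separated string, such as ["Cochi, Hyderabad"], you need a third level of flattening with STRTOK_TO_ARRAY, as follows: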

create or replace table t1 (json variant) as
select parse_json('{"Name": "A", "Gender": "M", "orders": ["completed"], "city": ["Cochi, Hyderabad"]}')
union all
select parse_json('{"Name": "B", "Gender": "M", "orders": ["completed"], "city": ["Cochi, Hyderabad, Delhi"]}')
union all
select parse_json('{"Name": "C", "Gender": "F", "orders": ["cancelled"], "city": ["Mumbai, Pune"]}')
union all
select parse_json('{"Name": "D", "Gender": "M", "orders": ["pending"], "city": ["Cochi"]}')
;

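select
    json:Name::varchar name,
    json:Gender::varchar gender,
    orders.value::varchar orders,
    trim(city.value::varchar) city  -- drop the space that follows each comma
from t1,
lateral flatten(json:orders) orders,
lateral flatten(json:city) city_raw,
-- split on commas only: every character of the delimiter string is a separator,
-- so ', ' would also break up city names that contain a space
lateral flatten(strtok_to_array(city_raw.value::varchar, ',')) city
;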
/*
NAME  GENDER  ORDERS     CITY
A     M       completed  Cochi
A     M       completed  Hyderabad
B     M       completed  Cochi
B     M       completed  Hyderabad
B     M       completed  Delhi
C     F       cancelled  Mumbai
C     F       cancelled  Pune
D     M       pending    Cochi
*/
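
One detail of STRTOK_TO_ARRAY is worth knowing here: each character in its second argument is treated as a separate delimiter. The sketch below uses a made-up value ('New Delhi, Pune' does not appear in the question) to show why splitting on the comma alone and trimming each element is the safer choice when a city name may contain a space:

```sql
-- With ', ' as the delimiter set, the space inside "New Delhi" is also a
-- separator, so the name is broken into separate tokens.
select strtok_to_array('New Delhi, Pune', ', ') as comma_and_space;

-- Splitting on ',' only keeps "New Delhi" whole; the second token keeps its
-- leading space, which TRIM removes after flattening.
select trim(c.value::varchar) as city
from table(flatten(input => strtok_to_array('New Delhi, Pune', ','))) c
order by c.index;
-- New Delhi
-- Pune
```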

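LATERAL FLATTEN is an idiom for flattening (expanding) the values of an object (JSON) or array into rows and combining them with the original row of the parent table.

So the query above:

  • flattens the array in the orders field, expanding its elements into rows in the ORDERS column of the output
  • flattens the array in the city field, expanding its elements into rows in the intermediate CITY_RAW column
  • splits the comma-separated string taken from CITY_RAW into city names, then flattens that intermediate array of strings into rows in the CITY column

This way you get the desired output without any complex string manipulation.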
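
Each LATERAL FLATTEN multiplies the rows produced by the one before it, so two flattened arrays on the same record yield every combination of their elements. In the sample data every orders array has a single element, which is why the row count equals the number of cities. A minimal sketch with a made-up record (the column alias j and the two-element orders array are assumptions for illustration):

```sql
select o.value::varchar as orders
     , c.value::varchar as city
from (select parse_json('{"orders": ["completed", "pending"], "city": ["Cochi", "Hyderabad"]}') as j) t
   , lateral flatten(input => t.j:orders) o
   , lateral flatten(input => t.j:city) c
order by o.index, c.index;
-- completed  Cochi
-- completed  Hyderabad
-- pending    Cochi
-- pending    Hyderabad
```

If the orders and city arrays are meant to be matched by position rather than combined, this pattern would not be the right one.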


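However, I suspect the array stored in the JSON actually looks like ["Cochi", "Hyderabad"], that is, an array of city names in which each name is individually double-quoted.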
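
A quick way to tell the two shapes apart is to look at the array size and its first element. This is only a sketch on two inline values, one per shape; against the real table the same expressions would be applied to the city field:

```sql
select array_size(v)  as element_count
     , v[0]::varchar  as first_element
from (
  select parse_json('["Cochi, Hyderabad"]') as v     -- one comma-separated string
  union all
  select parse_json('["Cochi", "Hyderabad"]')        -- one element per city
);
-- element_count = 1, first_element = Cochi, Hyderabad
-- element_count = 2, first_element = Cochi
```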

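If the city array really does hold one element per city, the query is simple: you only need to flatten the arrays in the orders and city fields by applying LATERAL FLATTEN to each of them.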

create or replace table t2 (json variant) as
select parse_json('{"Name": "A", "Gender": "M", "orders": ["completed"], "city": ["Cochi", "Hyderabad"]}')
union all
select parse_json('{"Name": "B", "Gender": "M", "orders": ["completed"], "city": ["Cochi", "Hyderabad", "Delhi"]}')
union all
select parse_json('{"Name": "C", "Gender": "F", "orders": ["cancelled"], "city": ["Mumbai", "Pune"]}')
union all
select parse_json('{"Name": "D", "Gender": "M", "orders": ["pending"], "city": ["Cochi"]}')
;


select
    json:Name::varchar name,
    json:Gender::varchar gender,
    orders.value::varchar orders,
    city.value::varchar city
from t2,
lateral flatten(json:orders) orders,
lateral flatten(json:city) city
;
/*
NAME  GENDER  ORDERS     CITY
A     M       completed  Cochi
A     M       completed  Hyderabad
B     M       completed  Cochi
B     M       completed  Hyderabad
B     M       completed  Delhi
C     F       cancelled  Mumbai
C     F       cancelled  Pune
D     M       pending    Cochi
*/
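
One caveat applies to both queries: a plain LATERAL FLATTEN drops any record whose array is empty or missing. If such records should still appear in the output, the OUTER => TRUE argument of FLATTEN keeps them with a NULL value. A minimal sketch with a made-up record E that is not in the question:

```sql
select t.json:Name::varchar as name
     , c.value::varchar     as city
from (select parse_json('{"Name": "E", "Gender": "F", "orders": ["pending"], "city": []}') as json) t
   , lateral flatten(input => t.json:city, outer => true) c;
-- one row: name = E, city = NULL
-- without outer => true the query returns no row for E
```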

