
Break JSON list of values into rows in a SNOWFLAKE database table

I have a table like the one in the screenshot below. It comes from parsed JSON, and I want the lists of values in the City and orders columns to be split into rows.

Could someone please help me?

[image: parsed input table]

The desired output is as follows:

[image: desired output]

Here is one way to do it. The double quotes in the city column enclose the whole list rather than each element, so first strip the leading [" and the trailing "]. Then tokenize the remaining string into a real array with strtok_to_array, flatten the array elements into separate rows, and lateral-join those rows (the cities) back to the rest of the record.

-- Sample data: orders and city are plain strings that only look like JSON arrays
with data as
( select 'A' as name, 'M' as gender, '["completed"]' as orders, '["Cochi,Hyderabad"]' as city
  union all
  select 'B' as name, 'M' as gender, '["completed"]' as orders, '["Cochi,Hyderabad,Delhi"]' as city
  union all
  select 'C' as name, 'F' as gender, '["cancelled"]' as orders, '["Mumbai,Pune"]' as city
  union all
  select 'D' as name, 'M' as gender, '["pending"]' as orders, '["cochi"]' as city
)
-- Strip the [" and "] wrappers, then tokenize city into a real array
, data2 as
( select d.name
       , d.gender
       , replace(replace(d.orders, '["', ''), '"]', '') as orders
       , strtok_to_array(replace(replace(d.city, '["', ''), '"]', ''), ',') as city
  from data d
)
-- One output row per city, joined back to the rest of the record
select d2.name
     , d2.gender
     , d2.orders
     , c.value::varchar as city   -- cast the VARIANT element to a plain string
from data2 d2
   , lateral flatten(input => d2.city) c;

If the city field is actually an array containing a single comma-separated string, like ["Cochi, Hyderabad"], it needs a third level of flattening with STRTOK_TO_ARRAY, as below:

create or replace table t1 (json variant) as
select parse_json('{"Name": "A", "Gender": "M", "orders": ["completed"], "city": ["Cochi, Hyderabad"]}')
union all
select parse_json('{"Name": "B", "Gender": "M", "orders": ["completed"], "city": ["Cochi, Hyderabad, Delhi"]}')
union all
select parse_json('{"Name": "C", "Gender": "F", "orders": ["cancelled"], "city": ["Mumbai, Pune"]}')
union all
select parse_json('{"Name": "D", "Gender": "M", "orders": ["pending"], "city": ["Cochi"]}')
;

select
    json:Name::varchar name,
    json:Gender::varchar gender,
    orders.value::varchar orders,
    city.value::varchar city
from t1,
lateral flatten(json:orders) orders,
lateral flatten(json:city) city_raw,
lateral flatten(strtok_to_array(city_raw.value, ', ')) city
;
/*
NAME  GENDER  ORDERS     CITY
A     M       completed  Cochi
A     M       completed  Hyderabad
B     M       completed  Cochi
B     M       completed  Hyderabad
B     M       completed  Delhi
C     F       cancelled  Mumbai
C     F       cancelled  Pune
D     M       pending    Cochi
*/

LATERAL FLATTEN is an idiom that expands the values in an object (JSON) or an array into rows and joins each of those rows with the original row of the parent table.

So the query above does the following:

  • Flattens the array in the orders field, expanding its elements into rows of the ORDERS column in the output
  • Flattens the array in the city field, expanding its elements into rows of the intermediate CITY_RAW value (not selected in the output)
  • Splits the comma-separated string in CITY_RAW into city names and flattens the resulting array of strings into rows of the CITY column
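
To see these intermediate steps, it can help to select the INDEX and VALUE columns that FLATTEN returns at each level. The sketch below assumes the t1 table created earlier; the column aliases are only illustrative.

```sql
-- Sketch: expose the intermediate values of each flatten level.
-- Assumes the t1 table defined above; aliases are illustrative only.
select
    json:Name::varchar   name,
    city_raw.index       raw_index,   -- position inside the original city array
    city_raw.value       raw_value,   -- the whole comma-separated string (VARIANT)
    city.index           city_index,  -- position inside the tokenized array
    city.value::varchar  city
from t1,
lateral flatten(input => json:city) city_raw,
lateral flatten(input => strtok_to_array(city_raw.value::varchar, ', ')) city
order by name, raw_index, city_index
;
```

Because each city array in t1 holds a single string, raw_index stays at 0 and only city_index advances within a row.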

Chaining the flattens this way produces the desired output without any complex string manipulation.
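
One caveat with the STRTOK_TO_ARRAY call: as far as I know, Snowflake treats every character in its delimiter argument as a separate delimiter, so ', ' splits on spaces as well as on commas, and a multi-word name such as New Delhi would end up in two rows. A minimal sketch of an alternative, assuming city names may contain spaces, splits on the comma only with SPLIT and then trims each piece. The t_demo CTE and its sample row are hypothetical and only there to make the example self-contained.

```sql
-- Sketch: split on the comma only, then trim surrounding whitespace,
-- so that a multi-word name such as 'New Delhi' stays in one row.
-- t_demo and its sample row are hypothetical, not part of the original data.
with t_demo as (
    select parse_json('{"Name": "E", "city": ["New Delhi, Cochi"]}') as json
)
select
    json:Name::varchar         name,
    trim(city.value::varchar)  city
from t_demo,
lateral flatten(input => json:city) city_raw,
lateral flatten(input => split(city_raw.value::varchar, ',')) city
;
```

SPLIT uses the whole separator string as one unit, which is why the leftover space after each comma has to be removed with TRIM.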


However, I suspect the array stored in the JSON actually looks like ["Cochi", "Hyderabad"], that is, an array in which each city name is a separate double-quoted element.

In that case the query is simpler: you just need to flatten the arrays in the orders field and the city field, with one LATERAL FLATTEN for each.

create or replace table t2 (json variant) as
select parse_json('{"Name": "A", "Gender": "M", "orders": ["completed"], "city": ["Cochi", "Hyderabad"]}')
union all
select parse_json('{"Name": "B", "Gender": "M", "orders": ["completed"], "city": ["Cochi", "Hyderabad", "Delhi"]}')
union all
select parse_json('{"Name": "C", "Gender": "F", "orders": ["cancelled"], "city": ["Mumbai", "Pune"]}')
union all
select parse_json('{"Name": "D", "Gender": "M", "orders": ["pending"], "city": ["Cochi"]}')
;


select
    json:Name::varchar name,
    json:Gender::varchar gender,
    orders.value::varchar orders,
    city.value::varchar city
from t2,
lateral flatten(json:orders) orders,
lateral flatten(json:city) city
;
/*
NAME  GENDER  ORDERS     CITY
A     M       completed  Cochi
A     M       completed  Hyderabad
B     M       completed  Cochi
B     M       completed  Hyderabad
B     M       completed  Delhi
C     F       cancelled  Mumbai
C     F       cancelled  Pune
D     M       pending    Cochi
*/
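
Two things are worth keeping in mind with this pattern. Each row is expanded to every combination of its orders and city elements, so a record with two orders and three cities yields six rows. Also, by default FLATTEN drops a record entirely when one of the arrays is empty or missing. A minimal sketch, assuming you want to keep such records, passes OUTER => TRUE; the t_demo CTE and its sample row are hypothetical.

```sql
-- Sketch: keep records whose array is empty by using OUTER => TRUE.
-- t_demo and its sample row are hypothetical, not part of the original data.
with t_demo as (
    select parse_json('{"Name": "E", "Gender": "F", "orders": ["pending"], "city": []}') as json
)
select
    json:Name::varchar     name,
    json:Gender::varchar   gender,
    orders.value::varchar  orders,
    city.value::varchar    city    -- NULL when the city array has no elements
from t_demo,
lateral flatten(input => json:orders, outer => true) orders,
lateral flatten(input => json:city,   outer => true) city
;
```

Without OUTER => TRUE on the city flatten, the sample record would not appear in the result at all.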


 