
How to map object/json array in Snowflake SQL / DBT Macro?

id|some_attribute|json_array                              |
--+--------------+----------------------------------------+
 1|"abc"         |[ { attr: 'apple'}, { attr: 'banana' } ]|

How can I get rid of the attr key in json_array so that the table looks like the one below?

id|some_attribute|string_array         |
--+--------------+---------------------+
 1|"abc"         |[ 'apple', 'banana' ]|

The use case is the data-cleaning stage: flattening the array there makes further processing and analysis simpler in later stages of the pipeline.

Thx for the help!

One option is to FLATTEN the JSON array, then rebuild a string array from the extracted values with ARRAY_AGG.

For example:

WITH data AS(
  SELECT 1 id, 'abc' as some_attribute
, [{ 'attr': 'apple'}, { 'attr': 'banana' } ] as json_array
)
SELECT 
  id
, some_attribute
  -- ORDER BY index keeps the elements in their original array order
, ARRAY_AGG(value:attr::string) WITHIN GROUP( ORDER BY index) as string_array
FROM
  data
, TABLE(FLATTEN(input => json_array))
GROUP BY
  id
, some_attribute

which returns

ID|SOME_ATTRIBUTE|STRING_ARRAY      |
--+--------------+------------------+
 1|abc           |["apple","banana"]|
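One thing to watch with the FLATTEN approach: the plain comma join produces no rows for an input whose array is empty or NULL, so those ids drop out of the result. A minimal sketch of one way around that, assuming you want such rows kept, is to pass OUTER => TRUE to FLATTEN. The second row (id 2, 'def') is a hypothetical example added only for illustration.

```sql
WITH data AS (
  SELECT 1 AS id, 'abc' AS some_attribute
       , [{ 'attr': 'apple' }, { 'attr': 'banana' }] AS json_array
  UNION ALL
  -- hypothetical row with an empty array, not part of the original question
  SELECT 2, 'def', ARRAY_CONSTRUCT()
)
SELECT
  id
, some_attribute
  -- ARRAY_AGG skips NULL inputs, so the placeholder row emitted by OUTER => TRUE adds no element
, ARRAY_AGG(f.value:attr::string) WITHIN GROUP (ORDER BY f.index) AS string_array
FROM
  data
  -- OUTER => TRUE emits one row with NULL value/index when the input has nothing to flatten
, TABLE(FLATTEN(input => json_array, outer => TRUE)) f
GROUP BY
  id
, some_attribute;
```

With this variant, id 2 should come back with an empty array instead of disappearing. Note that elements without an attr key are skipped in the same way, because their extracted value is NULL.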

Another option is to create a JavaScript UDF that maps over the array and picks out the given key. For example:

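CREATE OR REPLACE FUNCTION ARRAY_JSON_VALUES("a" ARRAY, "attr" STRING) 
RETURNS ARRAY 
LANGUAGE JAVASCRIPT RETURNS NULL ON NULL INPUT IMMUTABLE
AS 
$$ 
  // The argument names are double-quoted in the signature so they stay lowercase here;
  // unquoted names would be exposed to JavaScript in uppercase (A, ATTR).
  return a.map(e => e[attr]);
$$;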

Then call it like this:

WITH data AS(
  SELECT 1 id, 'abc' as some_attribute, [{ 'attr': 'apple'}, { 'attr': 'banana' } ] as json_array
)
SELECT 
  id
, some_attribute
, ARRAY_JSON_VALUES(json_array,'attr') as string_array
FROM
  data

which again returns

ID|SOME_ATTRIBUTE|STRING_ARRAY      |
--+--------------+------------------+
 1|abc           |["apple","banana"]|
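A third possibility, not part of the two options above, is Snowflake's TRANSFORM higher-order function, which applies a lambda expression to each element of an array. The sketch below assumes TRANSFORM is available in your account and reuses the same sample data. It needs neither a GROUP BY nor a UDF.

```sql
WITH data AS (
  SELECT 1 AS id, 'abc' AS some_attribute
       , [{ 'attr': 'apple' }, { 'attr': 'banana' }] AS json_array
)
SELECT
  id
, some_attribute
  -- the lambda receives each array element as e and returns its attr value
, TRANSFORM(json_array, e -> e:attr) AS string_array
FROM
  data;
```

Because the array is rewritten in place, element order is preserved without an explicit ORDER BY. How elements that lack an attr key are represented in the result is not verified here, so test that case before relying on it.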
