id | some_attribute | json_array |
---|---|---|
1 | "abc" | [ { attr: 'apple'}, { attr: 'banana' } ] |
How to get the get rid of attr
in json_array
so that the table results into something like table below?
id | some_attribute | string_array |
---|---|---|
1 | "abc" | [ 'apple', 'banana' ] |
Use case is during the cleaning stage of the data to make further processing and analysis simpler in later stages of the pipeline.
Thx for the help!
One option is to FLATTEN
the json array, then construct the string array from the values.
For example
WITH data AS(
SELECT 1 id, 'abc' as some_attribute
, [{ 'attr': 'apple'}, { 'attr': 'banana' } ] as json_array
)
SELECT
id
, some_attribute
, ARRAY_AGG(value:attr::string) WITHIN GROUP( ORDER BY index) as string_array
FROM
data
, TABLE(FLATTEN(input => json_array))
GROUP BY
id
, some_attribute
which returns
ID|SOME_ATTRIBUTE|STRING_ARRAY |
--+--------------+------------------+
1|abc |["apple","banana"]|
Another option is to create a JavaScript UDF. For example
CREATE OR REPLACE FUNCTION ARRAY_JSON_VALUES("a" ARRAY, "attr" STRING)
RETURNS ARRAY
LANGUAGE JAVASCRIPT RETURNS NULL ON NULL INPUT IMMUTABLE
AS
$$
return a.map(e => e[attr]);
$$
then
WITH data AS(
SELECT 1 id, 'abc' as some_attribute, [{ 'attr': 'apple'}, { 'attr': 'banana' } ] as json_array
)
SELECT
id
, some_attribute
, ARRAY_JSON_VALUES(json_array,'attr') as string_array
FROM
data
again returns
ID|SOME_ATTRIBUTE|STRING_ARRAY |
--+--------------+------------------+
1|abc |["apple","banana"]|
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.