I need to create a new column in a Bigquery table. This new column values must be an array of the colour column names based on a condition. My table "Users" looks like:
User | Red | Blue | Green | Yellow | Other columns not related |
---|---|---|---|---|---|
First | 1 | 1 | 0 | 1 | |
Second | 0 | 0 | 1 | 0 | |
Third | 1 | 1 | 0 | 0 |
and the expected table or view must be something like this:
User | Red | Blue | Green | Yellow | Array_col |
---|---|---|---|---|---|
First | 1 | 1 | 0 | 1 | Red, Blue, Yellow |
Second | 0 | 0 | 1 | 0 | Green |
Third | 1 | 1 | 0 | 0 | Red, Blue |
I am relatively new to SQL and Bigquery sintax, I've tried creating a view and also with
ALTER TABLE myproject.mydataset.users`
and I get the column names with this:
SELECT column_names
FROM `myproject.mydataset`.INFORMATION_SCHEMA.COLUMNS
WHERE (table_name = 'users')
But I can't find how to create the array list and make a column with them. I'm crashing against a wall.
Maybe this will be ok:
with mytable as (
select 'First' as user, 1 as Red, 1 as Blue, 0 as Green, 1 as Yellow union all
select 'Second' as user, 0 as Red, 0 as Blue, 1 as Green, 0 as Yellow union all
select 'Third' as user, 1 as Red, 1 as Blue, 0 as Green, 0 as Yellow
)
select *, split(trim(if(Red=1, 'Red', '') || if(Blue=1, ' Blue', '') || if(Green=1, ' Green', '') || if(Yellow=1, ' Yellow', ''), ' '), ' ')
from mytable
Now you can try super new syntax for BigQuery! Let's do this with UNPIVOT
statement:
WITH mytable AS (
SELECT 'first' AS user, 1 AS Red, 1 AS Blue, 0 AS Green, 1 AS Yellow UNION ALL
SELECT 'second' AS user, 0 AS Red, 0 AS Blue, 1 AS Green, 0 AS Yellow UNION ALL
SELECT 'third' AS user, 1 AS Red, 1 AS Blue, 0 AS Green, 0 AS Yellow
)
SELECT t.*, Array_col FROM (
SELECT user, ARRAY_AGG(color_name) Array_col FROM mytable
UNPIVOT(check_color FOR color_name IN (Red,Blue,Green,Yellow))
WHERE check_color = 1
GROUP BY user
)
JOIN mytable t USING (user)
Yes, the column names are always the same...
In case if you want to consider only those 4 columns - check below
select t1.*,
array(select color from t2.colors where value = 1) Array_col
from mytable t1,
unnest([struct([struct<value int64, color string>
(Red, 'Red'), (Blue, 'Blue'), (Green, 'Green'), (Yellow, 'Yellow')
] as colors)]) t2
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.