Add derived column from existing column based on a condition in BigQuery
How to add a column that concatenates values from one column based on a condition in another column
I have a table like this:
column1 column2
product1 action_of_interest1
product2 action_of_interest1
product3 random_action
product1 action_of_interest2
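I want to add a new_column (a comma-separated list of entries) that appends the entry from column1 when the column2 value is action_of_interest1, and removes the column1 entry from the running concatenated list when the column2 value is action_of_interest2. For random_action rows, it does nothing and just prints the current list.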
This is the expected result table:
column1 column2 new_column
product1 action_of_interest1 product1
product2 action_of_interest1 product1,product2
product3 random_action product1,product2
product1 action_of_interest2 product2
How can I do this in BigQuery/SQL?
An approach using window functions:
WITH sample AS (
  -- column3 is added as an explicit ordering key for the rows
  SELECT "product1" AS column1, "action_of_interest1" AS column2, 1 AS column3
  UNION ALL
  SELECT "product2" AS column1, "action_of_interest1" AS column2, 2 AS column3
  UNION ALL
  SELECT "product3" AS column1, "random_action" AS column2, 3 AS column3
  UNION ALL
  SELECT "product1" AS column1, "action_of_interest2" AS column2, 4 AS column3
),
running_agg AS (
  SELECT
    *,
    -- Running array of products seen with action_of_interest1 ('' for other rows)
    ARRAY_AGG(IF(column2 = 'action_of_interest1', column1, '')) OVER (ORDER BY column3 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS action_of_interest1,
    -- Running array of products seen with action_of_interest2 ('' for other rows)
    ARRAY_AGG(IF(column2 = 'action_of_interest2', column1, '')) OVER (ORDER BY column3 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS action_of_interest2
  FROM sample
)
SELECT
  * EXCEPT (action_of_interest1, action_of_interest2),
  -- Products in the first array that do not appear in the second, joined by commas
  ARRAY_TO_STRING(
    ARRAY(
      SELECT * FROM UNNEST(action_of_interest1)
      EXCEPT DISTINCT
      SELECT * FROM UNNEST(action_of_interest2)
    ),
    ','
  ) AS new_column
FROM running_agg
Output:
column1 column2 column3 new_column
product1 action_of_interest1 1 product1
product2 action_of_interest1 2 product1,product2
product3 random_action 3 product1,product2
product1 action_of_interest2 4 product2
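To sanity-check the query output outside BigQuery, the add/remove rule from the question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `running_products` and its parameters are hypothetical, rows are assumed to arrive already sorted by the ordering column, and skipping a product that is already in the list is my assumption, not something the question specifies.

```python
def running_products(rows,
                     add_action="action_of_interest1",
                     remove_action="action_of_interest2"):
    """Return the new_column value for each (column1, column2) row, in order.

    Hypothetical helper: names and defaults are illustrative only.
    """
    current = []  # products currently in the running list, in insertion order
    result = []
    for product, action in rows:
        if action == add_action:
            # Assumption: a product already in the list is not added twice.
            if product not in current:
                current.append(product)
        elif action == remove_action:
            if product in current:
                current.remove(product)
        # Any other action leaves the running list unchanged.
        result.append(",".join(current))
    return result


# Sample data from the question, already in column3 order.
rows = [
    ("product1", "action_of_interest1"),
    ("product2", "action_of_interest1"),
    ("product3", "random_action"),
    ("product1", "action_of_interest2"),
]

for (product, action), new_column in zip(rows, running_products(rows)):
    print(product, action, new_column)
```

On the sample data this produces the same new_column values as the output table above. One difference worth noting: the sketch processes rows sequentially, so a product that is added again after being removed reappears in the list, whereas the set-difference query keeps excluding any product that has ever had action_of_interest2 up to the current row.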