How to add a column which does a string addition of values from other column based on condition from another column

I have a table like this:

column1     column2
product1    action_of_interest1
product2    action_of_interest1
product3    random_action
product1    action_of_interest2

I want to add a new_column (a comma-separated entry) that appends the column1 value to a running list whenever the column2 value is action_of_interest1, and removes the column1 value from that running list whenever the column2 value is action_of_interest2. For random_action rows, do nothing and just print whatever the current list is.

This is the resulting table:

column1     column2                 new_column
product1    action_of_interest1     product1
product2    action_of_interest1     product1,product2
product3    random_action           product1,product2
product1    action_of_interest2     product2

How can I do this in BigQuery/SQL?

An approach using window functions (column3 is added to the sample data to give the rows an explicit order):

-- Sample data; column3 gives the rows an explicit order for the window.
WITH sample AS (
  SELECT "product1" AS column1, "action_of_interest1" AS column2, 1 AS column3
  UNION ALL
  SELECT "product2" AS column1, "action_of_interest1" AS column2, 2 AS column3
  UNION ALL
  SELECT "product3" AS column1, "random_action" AS column2, 3 AS column3
  UNION ALL
  SELECT "product1" AS column1, "action_of_interest2" AS column2, 4 AS column3
),
-- For each row, collect every product added so far and every product removed so far.
-- Rows that do not match contribute '' as a placeholder.
running_agg AS (
  SELECT
    *,
    ARRAY_AGG(IF(column2 = 'action_of_interest1', column1, '')) OVER (ORDER BY column3 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS action_of_interest1,
    ARRAY_AGG(IF(column2 = 'action_of_interest2', column1, '')) OVER (ORDER BY column3 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS action_of_interest2
  FROM sample
)
SELECT
  * EXCEPT (action_of_interest1, action_of_interest2),
  ARRAY_TO_STRING(
    ARRAY(
      -- Set difference: added products minus removed products.
      -- Once any non-removal row has been seen, '' is in both arrays and drops out too.
      SELECT * FROM UNNEST(action_of_interest1)
      EXCEPT DISTINCT
      SELECT * FROM UNNEST(action_of_interest2)
    ),
    ','
  ) AS new_column
FROM running_agg
ORDER BY column3

Output:

column1     column2                 column3     new_column
product1    action_of_interest1     1           product1
product2    action_of_interest1     2           product1,product2
product3    random_action           3           product1,product2
product1    action_of_interest2     4           product2
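
To sanity-check the expected values outside BigQuery, the rule from the question can be sketched as a plain loop in Python. This is only an illustrative reference, not part of the query: the function name running_products is hypothetical, and it assumes the rows are already in the intended order and that a product added again after being removed should reappear (the question does not say either way).

```python
def running_products(rows):
    """Return the comma-joined running list after each (product, action) row."""
    current = []  # products currently in the list, in insertion order
    result = []
    for product, action in rows:
        if action == "action_of_interest1" and product not in current:
            current.append(product)   # add the product once
        elif action == "action_of_interest2" and product in current:
            current.remove(product)   # drop the product from the running list
        # any other action leaves the running list unchanged
        result.append(",".join(current))
    return result


# Same four rows as the sample data, already in column3 order.
rows = [
    ("product1", "action_of_interest1"),
    ("product2", "action_of_interest1"),
    ("product3", "random_action"),
    ("product1", "action_of_interest2"),
]

new_column = running_products(rows)
for (product, action), value in zip(rows, new_column):
    print(product, action, value)
```

One difference to keep in mind: the query takes a set difference between everything added so far and everything removed so far, so a product that has been removed stays removed even if a later action_of_interest1 row adds it again, whereas this loop would bring it back. For the sample data the two agree.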
