How do I add arrays in BigQuery SQL?

I have a UDF that returns an array of floats of the same size for every row of a table. How can I sum the values of these arrays?

In other words, how do I do this:

create temp function f(...)
returns array<float64>
...;
select sum(f(column)) from table

As the result of this operation, I need another array of the same size, where

result[i] = sum(over rows) f(row, column)[i]

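So, based on your comment, what you are looking for is the sum of the values of all the arrays. Here is how to do that with the UNNEST operator. Note that this produces a single scalar total over every element of every row, not an element-wise array: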

WITH mydata  AS (
  SELECT [1.4, 1.3, 1.4, 1.1] as myarray
  union all 
  SELECT [1.4, 1.3, 1.4, 1.1] as myarray
  union all 
  SELECT [1.4, 1.3, 1.4, 1.1] as myarray
)

SELECT SUM(eachelement) from mydata, UNNEST(myarray) AS eachelement; 

Here is a function that uses ANY TYPE so that it supports summing FLOAT64, INT64, and NUMERIC arrays, along with some sample inputs:

CREATE TEMP FUNCTION ElementWiseSum(arr1 ANY TYPE, arr2 ANY TYPE) AS (
  ARRAY(SELECT x + arr2[OFFSET(off)] FROM UNNEST(arr1) AS x WITH OFFSET off ORDER BY off)
);

SELECT arr1, arr2, ElementWiseSum(arr1, arr2) AS result
FROM (
  SELECT [1, 2, 3] AS arr1, [4, 5, 6] AS arr2 UNION ALL
  SELECT [7, 8], [9, 10] UNION ALL
  SELECT [], [] UNION ALL
  SELECT [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]
);

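It unnests arr1 using WITH OFFSET, then uses that offset to retrieve the corresponding element from arr2, and orders by the offset to ensure that element order is preserved.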
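
As a hedged variation on the function above: OFFSET raises an error when arr2 is shorter than arr1. The sketch below swaps in SAFE_OFFSET, which returns NULL for an out-of-range index, and treats a missing element as 0. The name ElementWiseSumSafe and the choice of 0 as the fill value are assumptions made for illustration, not part of the original answer.

```sql
-- Hypothetical variant: tolerate an arr2 that is shorter than arr1.
CREATE TEMP FUNCTION ElementWiseSumSafe(arr1 ANY TYPE, arr2 ANY TYPE) AS (
  ARRAY(
    -- SAFE_OFFSET yields NULL past the end of arr2; IFNULL turns that into 0.
    SELECT x + IFNULL(arr2[SAFE_OFFSET(off)], 0)
    FROM UNNEST(arr1) AS x WITH OFFSET off
    ORDER BY off
  )
);

-- Expected result: [11, 22, 3]
SELECT ElementWiseSumSafe([1, 2, 3], [10, 20]) AS result;
```

Because the function only iterates over arr1, any extra trailing elements of a longer arr2 are still ignored; pass the longer array first if that matters.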

Edit: to sum across rows, you can unnest the arrays, compute the sums grouped by element offset, and then re-aggregate the sums into a new array:

SELECT
  ARRAY_AGG(sum ORDER BY off) AS arr
FROM (
  SELECT
    off,
    SUM(x) AS sum
  FROM (
    SELECT [1, 2, 3] AS arr UNION ALL
    SELECT [7, 8, 9] UNION ALL
    SELECT [4, 5, 6] UNION ALL
    SELECT [10, 11, 12]
  ), UNNEST(arr) AS x WITH OFFSET off
  GROUP BY off
);
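
To connect this back to the question, here is a minimal end-to-end sketch that feeds a UDF's output into the same unnest, group, and re-aggregate pattern. The body of f and the mytable sample rows are hypothetical stand-ins, since the asker's real UDF and table are not shown.

```sql
-- Hypothetical stand-in for the asker's UDF: returns a fixed-size array per row.
CREATE TEMP FUNCTION f(x FLOAT64)
RETURNS ARRAY<FLOAT64>
AS ([x, x * 2, x * 3]);

WITH mytable AS (
  SELECT 1.0 AS col UNION ALL
  SELECT 2.0 UNION ALL
  SELECT 3.0
)
SELECT
  -- Rebuild the array, keeping the original element positions.
  ARRAY_AGG(total ORDER BY off) AS result
FROM (
  SELECT
    off,
    SUM(x) AS total  -- Sum of the off-th element across all rows.
  FROM mytable, UNNEST(f(col)) AS x WITH OFFSET off
  GROUP BY off
);
-- Expected result: [6.0, 12.0, 18.0]
```

This approach does not require knowing the array length in advance, only that every row returns arrays of the same size.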

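If the UDF takes your column as input and returns a float64 array of a predetermined (fixed) size, a simplified solution is possible. For example, for a 3-element array, something like: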

create temp function f(...)
returns array<float64>
...;

with dataset as (
  select arr[offset(0)] as col_a, arr[offset(1)] as col_b, arr[offset(2)] as col_c
    from (
       select f(mycolumn) as arr
       from `mydataset.mytable`
    )
)

select [sum(col_a), sum(col_b), sum(col_c)] as new_array from dataset
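
For a version that can be run as-is, the same fixed-size idea can be sketched with inline sample data and without the intermediate columns. The arrays CTE and its values are made up for illustration; in practice it would be replaced by the subquery that calls the UDF.

```sql
WITH arrays AS (
  -- Sample rows standing in for the UDF output.
  SELECT [1.0, 2.0, 3.0] AS arr UNION ALL
  SELECT [4.0, 5.0, 6.0] UNION ALL
  SELECT [7.0, 8.0, 9.0]
)
SELECT [
  SUM(arr[OFFSET(0)]),  -- Sum of the first elements.
  SUM(arr[OFFSET(1)]),  -- Sum of the second elements.
  SUM(arr[OFFSET(2)])   -- Sum of the third elements.
] AS new_array
FROM arrays;
-- Expected result: [12.0, 15.0, 18.0]
```

The trade-off is that each position has to be written out by hand, so this is only practical when the array size is small and known up front.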

This doesn't directly answer the OP's question, but people who land on this page after searching for "How do I add arrays in BigQuery SQL?" may benefit.

(Edited based on @elliott-brossard's answer.) If you have 2 arrays, but one of them contains structs, you can add them together with the following code:

WITH mydata AS (
  SELECT
    [1, 2, 3] AS arr
    -- ,[7, 8, 9] AS arr2
    ,[
      STRUCT(7 AS timeOnSite)
      ,STRUCT(8 AS timeOnSite)
      ,STRUCT(9 AS timeOnSite)
    ] AS arr2
)

SELECT
  (
    SELECT
      ARRAY_AGG(sum ORDER BY off) AS arr
    FROM (
      SELECT
        off,
        SUM(x) AS sum
      FROM (
        SELECT arr UNION ALL
        -- SELECT arr2
        SELECT (SELECT ARRAY_AGG(t.timeOnSite) FROM UNNEST(arr2) AS t)
      ), UNNEST(arr) AS x WITH OFFSET off
      GROUP BY off
    )  
  ) AS sum_arrays
FROM 
  mydata
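
As a possible alternative for the two-array case, the arrays can be joined on their offsets instead of being stacked with UNION ALL and grouped. This is only a sketch under the same sample data as above; the alias names s and off2 are introduced here for illustration.

```sql
WITH mydata AS (
  SELECT
    [1, 2, 3] AS arr,
    [
      STRUCT(7 AS timeOnSite),
      STRUCT(8 AS timeOnSite),
      STRUCT(9 AS timeOnSite)
    ] AS arr2
)
SELECT
  ARRAY(
    -- Pair up elements that share the same position in both arrays.
    SELECT x + s.timeOnSite
    FROM UNNEST(arr) AS x WITH OFFSET off
    JOIN UNNEST(arr2) AS s WITH OFFSET off2
      ON off = off2
    ORDER BY off
  ) AS sum_arrays
FROM mydata;
-- Expected result: [8, 10, 12]
```

Since this is an inner join, positions that exist in only one of the arrays are dropped from the result.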

