How do I add arrays in BigQuery SQL?
I have a UDF which returns a floating point array of the same size for each row of a table. How do I sum the values of these arrays?
In other words, how can I do something like this:
create temp function f(...)
returns array<float64>
...;
select sum(f(column)) from table
As a result of this operation, I need to get another array of equal size, where
result[i] = sum(over rows) f(row, column)[i]
So based on your comment, what you are looking for is the sum of the values of all your arrays. This is how you can do it using the UNNEST operator:
WITH mydata AS (
SELECT [1.4, 1.3, 1.4, 1.1] as myarray
union all
SELECT [1.4, 1.3, 1.4, 1.1] as myarray
union all
SELECT [1.4, 1.3, 1.4, 1.1] as myarray
)
SELECT SUM(eachelement) from mydata, UNNEST(myarray) AS eachelement;
Here is a function that uses ANY TYPE in order to support summing arrays of FLOAT64, INT64, or NUMERIC, along with some sample input:
CREATE TEMP FUNCTION ElementWiseSum(arr1 ANY TYPE, arr2 ANY TYPE) AS (
ARRAY(SELECT x + arr2[OFFSET(off)] FROM UNNEST(arr1) AS x WITH OFFSET off ORDER BY off)
);
SELECT arr1, arr2, ElementWiseSum(arr1, arr2) AS result
FROM (
SELECT [1, 2, 3] AS arr1, [4, 5, 6] AS arr2 UNION ALL
SELECT [7, 8], [9, 10] UNION ALL
SELECT [], [] UNION ALL
SELECT [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]
);
It unnests arr1 using WITH OFFSET, then retrieves the equivalent element from arr2 using this offset, and orders by the offset to ensure that the element order is preserved.
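One caveat with the function above: arr2[OFFSET(off)] raises an out-of-bounds error when arr2 is shorter than arr1. If that can happen in your data, a possible variant is sketched below. The name ElementWiseSumPadded is hypothetical, and treating missing elements as 0 is an assumption that may not suit every use case.

```sql
-- Hypothetical variant of ElementWiseSum: missing elements of arr2 count as 0.
CREATE TEMP FUNCTION ElementWiseSumPadded(arr1 ANY TYPE, arr2 ANY TYPE) AS (
  ARRAY(
    -- SAFE_OFFSET returns NULL instead of failing when off is out of range.
    SELECT x + IFNULL(arr2[SAFE_OFFSET(off)], 0)
    FROM UNNEST(arr1) AS x WITH OFFSET off
    ORDER BY off
  )
);

-- arr2 has only two elements, so the third element of arr1 passes through unchanged.
SELECT ElementWiseSumPadded([1, 2, 3], [10, 20]) AS result;  -- [11, 22, 3]
```

Note that this still iterates over arr1 only, so any extra trailing elements of arr2 are ignored.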
Edit: to sum across rows, you can unnest the arrays, compute sums grouped by the offset of the elements, then reaggregate the sums into a new array:
SELECT
ARRAY_AGG(sum ORDER BY off) AS arr
FROM (
SELECT
off,
SUM(x) AS sum
FROM (
SELECT [1, 2, 3] AS arr UNION ALL
SELECT [7, 8, 9] UNION ALL
SELECT [4, 5, 6] UNION ALL
SELECT [10, 11, 12]
), UNNEST(arr) AS x WITH OFFSET off
GROUP BY off
);
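To connect this back to the question, the same pattern can be applied directly to the output of a UDF. The following is a minimal sketch: the body of f, the table mytable, and the column col are all made up for illustration, since the original UDF is not shown.

```sql
-- Stand-in for the asker's UDF; the real body is unknown, this one is hypothetical.
CREATE TEMP FUNCTION f(x FLOAT64)
RETURNS ARRAY<FLOAT64>
AS ([x, x * 2, x * 3]);

WITH mytable AS (
  SELECT 1.0 AS col UNION ALL
  SELECT 2.0 UNION ALL
  SELECT 3.0
)
SELECT
  -- Reassemble the per-position sums into one array, in position order.
  ARRAY_AGG(total ORDER BY off) AS result
FROM (
  SELECT
    off,
    SUM(v) AS total
  -- Unnest the UDF result for each row, keeping each element's position.
  FROM mytable, UNNEST(f(col)) AS v WITH OFFSET off
  GROUP BY off
);
-- result: [6.0, 12.0, 18.0]
```

This gives result[i] = sum over rows of f(col)[i], without needing to know the array length in advance.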
If you have your UDF defined (it takes in your column(s) and returns a float64 array of a predetermined, fixed size), you can use a simplified solution. For example, in the case of 3-element arrays, something like:
create temp function f(...)
returns array<float64>
...;
with dataset as (
select arr[offset(0)] as col_a, arr[offset(1)] as col_b, arr[offset(2)] as col_c
from (
select f(mycolumn) as arr
from `mydataset.mytable`
)
)
select [sum(col_a), sum(col_b), sum(col_c)] as new_array from dataset
This does not directly answer the OP's question, but people landing on this page searching for "How do I add arrays in BigQuery SQL?" might benefit.
(Based on the edit to @elliott-brossard's answer) In case you have 2 arrays, but 1 of them is an array of structs, you can use the following code to add them together:
WITH mydata AS (
SELECT
[1, 2, 3] AS arr
-- ,[7, 8, 9] AS arr2
,[
STRUCT(7 AS timeOnSite)
,STRUCT(8 AS timeOnSite)
,STRUCT(9 AS timeOnSite)
] AS arr2
)
SELECT
(
SELECT
ARRAY_AGG(sum ORDER BY off) AS arr
FROM (
SELECT
off,
SUM(x) AS sum
FROM (
SELECT arr UNION ALL
-- SELECT arr2
SELECT (SELECT ARRAY_AGG(t.timeOnSite) FROM UNNEST(arr2) AS t)
), UNNEST(arr) AS x WITH OFFSET off
GROUP BY off
)
) AS sum_arrays
FROM
mydata
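When there are exactly two arrays in the same row, a shorter alternative may be to pair the elements by offset and read the struct field directly, in the style of the ElementWiseSum function shown earlier. This is only a sketch under the assumption that both arrays have the same length; it reuses the sample data above.

```sql
WITH mydata AS (
  SELECT
    [1, 2, 3] AS arr,
    [
      STRUCT(7 AS timeOnSite),
      STRUCT(8 AS timeOnSite),
      STRUCT(9 AS timeOnSite)
    ] AS arr2
)
SELECT
  ARRAY(
    -- Add each element of arr to the timeOnSite field at the same position in arr2.
    SELECT x + arr2[OFFSET(off)].timeOnSite
    FROM UNNEST(arr) AS x WITH OFFSET off
    ORDER BY off
  ) AS sum_arrays  -- [8, 10, 12]
FROM
  mydata;
```

This avoids the UNION ALL and GROUP BY steps, but it does not generalize to more than two arrays as easily as the grouped version.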