How can I apply aggregate functions element-wise over arrays in PostgreSQL, e.g. weighted array sums over a group?
I have a table like the following (see db<>fiddle):
grp | n | vals |
---|---|---|
0 | 2 | {1,2,3,4} |
1 | 5 | {3,2,1,2} |
1 | 3 | {0,5,4,3} |
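For each group (defined by grp), I want to do some arithmetic that involves the group's scalars n and arrays vals. I am interested in a kind of weighted sum: each row's vals is multiplied by its n, and the resulting arrays are summed element-wise within each group, producing one array per group:
grp | result |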
---|---|
0 | {2,4,6,8} |
1 | {15,25,17,19} |
Here is what I tried. It fails with an error (aggregate function calls cannot contain set-returning function calls):
SELECT
grp,
ARRAY(SELECT SUM(n * UNNEST(vals)))
FROM
tbl
GROUP BY
grp
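The error comes with a hint, but I could not work out how to apply it to my use case.
The following collapses the desired arrays into scalars: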
SELECT
grp,
SUM(n * vals[i])
FROM
tbl,
generate_series(1, 4) i
GROUP BY
grp
This only sort of works:
SELECT
grp,
SUM(n * vals[1]),
SUM(n * vals[2]),
SUM(n * vals[3]),
SUM(n * vals[4])
FROM
tbl
GROUP BY
grp
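But it does not produce an array, and it requires writing out each array element separately. In my case the arrays are much longer than four elements, so this is too unwieldy.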
WITH flattened AS (
SELECT grp, position, SUM(val * n) AS s
FROM tbl, unnest(vals) WITH ORDINALITY AS f(val, position)
GROUP BY grp, position
ORDER BY grp, position
)
SELECT grp, array_agg(s ORDER BY position)
FROM flattened
GROUP BY grp
;
+---+-------------------------------------------------------------------------------------+
|grp|array_agg |
+---+-------------------------------------------------------------------------------------+
|0 |{2.00000000000000000,4.00000000000000000,6.00000000000000000,8.00000000000000000} |
|1 |{15.00000000000000000,25.00000000000000000,17.00000000000000000,19.00000000000000000}|
+---+-------------------------------------------------------------------------------------+
Explanation:
You can use UNNEST ... WITH ORDINALITY to keep track of each value's position:
SELECT grp, position, val, n
FROM tbl, unnest(vals) WITH ORDINALITY AS f(val, position);
+---+--------+---+-+
|grp|position|val|n|
+---+--------+---+-+
|0 |1 |1 |2|
|0 |2 |2 |2|
|0 |3 |3 |2|
|0 |4 |4 |2|
|1 |1 |3 |5|
|1 |2 |2 |5|
|1 |3 |1 |5|
|1 |4 |2 |5|
|1 |1 |0 |3|
|1 |2 |5 |3|
|1 |3 |4 |3|
|1 |4 |3 |3|
+---+--------+---+-+
Then GROUP BY the original group and each position:
SELECT grp, position, SUM(val * n) AS s
FROM tbl, unnest(vals) WITH ORDINALITY AS f(val, position)
GROUP BY grp, position
ORDER BY grp, position;
+---+--------+--+
|grp|position|s |
+---+--------+--+
|0 |1 |2 |
|0 |2 |4 |
|0 |3 |6 |
|0 |4 |8 |
|1 |1 |15|
|1 |2 |25|
|1 |3 |17|
|1 |4 |19|
+---+--------+--+
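After that, all that is left is the ARRAY_AGG step from the full query at the top of this answer, which collects the per-position sums back into one array per group.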
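For anyone who wants to reproduce this without the fiddle, here is a minimal self-contained sketch. The table definition is an assumption (the question does not state the column types; integer columns are used here), and the alias names u, pos and per_position are only illustrative. It also writes the join as an explicit CROSS JOIN LATERAL, which is what the comma join in the queries above does implicitly and what the hint attached to the original error ("move the set-returning function into a LATERAL FROM item") points to.

```sql
-- Assumed schema: the column types are a guess, not taken from the question.
CREATE TEMP TABLE tbl (grp int, n int, vals int[]);

INSERT INTO tbl (grp, n, vals) VALUES
    (0, 2, '{1,2,3,4}'),
    (1, 5, '{3,2,1,2}'),
    (1, 3, '{0,5,4,3}');

-- Inner query: one weighted sum per (grp, array position).
-- Outer query: collect those sums back into an array, ordered by position.
SELECT grp, array_agg(s ORDER BY pos) AS result
FROM (
    SELECT t.grp, u.pos, SUM(u.val * t.n) AS s
    FROM tbl AS t
    CROSS JOIN LATERAL unnest(t.vals) WITH ORDINALITY AS u(val, pos)
    GROUP BY t.grp, u.pos
) AS per_position
GROUP BY grp
ORDER BY grp;
-- With the integer columns assumed above this yields
-- {2,4,6,8} for grp 0 and {15,25,17,19} for grp 1.
```

The ORDER BY inside array_agg is what guarantees the element order; the order of rows coming out of the inner query does not.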
I would write functions for this, otherwise the SQL gets really messy.
One function that multiplies all elements by a given value:
create function array_mul(p_input real[], p_mul int)
returns real[]
as
$$
select array(select i * p_mul
from unnest(p_input) with ordinality as t(i,idx)
order by idx);
$$
language sql
immutable;
And one function, to be used as the state function of an aggregate, that sums the elements with the same index:
create or replace function array_add(p_one real[], p_two real[])
returns real[]
as
$$
declare
l_idx int;
l_result real[];
begin
if p_one is null or p_two is null then
return coalesce(p_one, p_two);
end if;
for l_idx in 1..greatest(cardinality(p_one), cardinality(p_two)) loop
l_result[l_idx] := coalesce(p_one[l_idx],0) + coalesce(p_two[l_idx], 0);
end loop;
return l_result;
end;
$$
language plpgsql
immutable;
This can be used to define a custom aggregate:
create aggregate array_element_sum(real[]) (
sfunc = array_add,
stype = real[],
initcond = '{}'
);
Then your query is as simple as:
select grp, array_element_sum(array_mul(vals, n))
from tbl
group by grp;
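If you would rather avoid the PL/pgSQL loop, the state function can probably also be written in plain SQL. The sketch below is an untested variation, not part of the approach above: the names array_add_sql and array_element_sum_sql are made up for illustration. It relies on the multi-argument form of unnest in the FROM clause, which pads the shorter array with NULLs, so arrays of different lengths are handled the same way as in array_add.

```sql
-- Hypothetical plain-SQL alternative to array_add (names are illustrative).
create function array_add_sql(p_one real[], p_two real[])
  returns real[]
as
$$
  -- unnest(a, b) walks both arrays in parallel; a missing element on
  -- either side comes back as NULL and is treated as 0.
  select array(select (coalesce(a, 0) + coalesce(b, 0))::real
               from unnest(p_one, p_two) with ordinality as t(a, b, idx)
               order by idx);
$$
language sql
immutable;

-- Same aggregate definition as before, wired to the SQL state function.
create aggregate array_element_sum_sql(real[]) (
  sfunc = array_add_sql,
  stype = real[],
  initcond = '{}'
);

-- Usage mirrors the query above.
select grp, array_element_sum_sql(array_mul(vals, n))
from tbl
group by grp;
```

One difference to be aware of: a NULL input array is treated here as an empty array rather than being passed through by coalesce as in array_add.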