Cumulative sum over a table
What is the best way to compute a cumulative sum over a table in Postgres, in a way that performs well and stays flexible if more fields / columns are added to the table?
Table

     a    b    d
1   59   15  181
2        16  268
3            219
4            102

Cumulative

     a    b    d
1   59   15  181
2        31  449
3            668
4            770
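For anyone who wants to reproduce the example, here is a minimal setup sketch. The table name tbl, the name id for the unlabeled first column, and the integer column types are all assumptions; the question does not state them.

```sql
-- Hypothetical setup: table name, id column name, and integer types are assumed.
CREATE TEMP TABLE tbl (
    id integer PRIMARY KEY,  -- the unlabeled first column of the sample table
    a  integer,
    b  integer,
    d  integer
);

-- Blank cells in the sample table are loaded as NULL.
INSERT INTO tbl (id, a, b, d) VALUES
    (1, 59,   15,   181),
    (2, NULL, 16,   268),
    (3, NULL, NULL, 219),
    (4, NULL, NULL, 102);
```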
Use window functions to compute a running sum.
SELECT sum(a) OVER (ORDER BY d) AS "a",
       sum(b) OVER (ORDER BY d) AS "b",
       sum(d) OVER (ORDER BY d) AS "d"
FROM   tbl;  -- "table" is a reserved word in Postgres, so the table is called tbl here
If you have more than one running sum, make sure they all use the same ORDER BY.
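One way to keep the orderings identical is to define the window once and reuse it by name. This is only a sketch: it assumes the table is called tbl and has an id column that defines the row order, neither of which is given in the question.

```sql
-- Assumes a table tbl with an ordering column id (both names are assumptions).
SELECT id,
       sum(a) OVER w AS a,
       sum(b) OVER w AS b,
       sum(d) OVER w AS d
FROM   tbl
WINDOW w AS (ORDER BY id)  -- single definition shared by all three sums
ORDER  BY id;
```

If more columns are added later, each one only needs another sum(...) OVER w line.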
Note that if you want your columns to appear as in the cumulative table in your question (each field ordered independently), it is a little more involved.
Update: I've modified the query to do the required sorting without relying on a common ordering field.
WITH
rcd AS (
   -- number the rows so the per-column results can be joined back together
   SELECT row_number() OVER () AS num, a, b, d
   FROM   tbl
   ),
sorted_a AS (
   -- w1 numbers the rows with NULLs at the end;
   -- w2 sorts NULLs first, so the sum for NULL rows stays NULL
   SELECT row_number() OVER (w1) AS num, sum(a) OVER (w2) AS a
   FROM   tbl
   WINDOW w1 AS (ORDER BY a NULLS LAST),
          w2 AS (ORDER BY a NULLS FIRST)
   ),
sorted_b AS (
   SELECT row_number() OVER (w1) AS num, sum(b) OVER (w2) AS b
   FROM   tbl
   WINDOW w1 AS (ORDER BY b NULLS LAST),
          w2 AS (ORDER BY b NULLS FIRST)
   ),
sorted_d AS (
   SELECT row_number() OVER (w1) AS num, sum(d) OVER (w2) AS d
   FROM   tbl
   WINDOW w1 AS (ORDER BY d NULLS LAST),
          w2 AS (ORDER BY d NULLS FIRST)
   )
SELECT sorted_a.a, sorted_b.b, sorted_d.d
FROM   rcd
JOIN   sorted_a USING (num)
JOIN   sorted_b USING (num)
JOIN   sorted_d USING (num)
ORDER  BY num;
You can use window functions, but you need additional logic to avoid producing values in rows where the column is NULL:
SELECT id,
       (CASE WHEN a IS NOT NULL THEN sum(a) OVER (ORDER BY id) END) AS a,
       (CASE WHEN b IS NOT NULL THEN sum(b) OVER (ORDER BY id) END) AS b,
       (CASE WHEN d IS NOT NULL THEN sum(d) OVER (ORDER BY id) END) AS d
FROM   tbl;  -- "table" is a reserved word in Postgres, so the table is called tbl here
This assumes that the first column, which specifies the ordering, is called id.
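To see why the CASE is needed, here is a small side-by-side sketch using column a from the question. The rows are inlined with VALUES so the query runs on its own; the names id, a_plain and a_masked are made up for illustration.

```sql
-- Column a from the question, inlined; id and the output names are assumptions.
SELECT id,
       sum(a) OVER (ORDER BY id) AS a_plain,   -- sum() skips NULLs, so the total is carried forward
       CASE WHEN a IS NOT NULL
            THEN sum(a) OVER (ORDER BY id)
       END AS a_masked                         -- NULL wherever the input is NULL
FROM  (VALUES (1, 59), (2, NULL), (3, NULL), (4, NULL)) AS t(id, a)
ORDER  BY id;
-- a_plain : 59, 59, 59, 59
-- a_masked: 59, NULL, NULL, NULL
```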
I think what you are really looking for is this:
SELECT id
     , sum(a) OVER (PARTITION BY a_grp ORDER BY id) AS a
     , sum(b) OVER (PARTITION BY b_grp ORDER BY id) AS b
     , sum(d) OVER (PARTITION BY d_grp ORDER BY id) AS d
FROM  (
   SELECT *
        , count(a IS NULL OR NULL) OVER (ORDER BY id) AS a_grp
        , count(b IS NULL OR NULL) OVER (ORDER BY id) AS b_grp
        , count(d IS NULL OR NULL) OVER (ORDER BY id) AS d_grp
   FROM   tbl
   ) sub
ORDER  BY id;
The expression count(col IS NULL OR NULL) OVER (ORDER BY id) forms groups of consecutive non-null rows for a, b and d in the subquery sub.
In the outer query we run cumulative sums per group. Each NULL value starts a new group, so it stays NULL automatically. No additional CASE expression is necessary.
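The grouping is easiest to see on a column with a gap in the middle. The sketch below uses made-up values for a (1, NULL, 5, 6) inlined with VALUES, and exposes the helper column a_grp next to the running sum; the output name a_sum is hypothetical.

```sql
-- Illustrative data only: the values 1, NULL, 5, 6 are not from the question.
SELECT id, a, a_grp,
       sum(a) OVER (PARTITION BY a_grp ORDER BY id) AS a_sum
FROM  (
   SELECT id, a,
          -- counts the NULLs seen so far: non-NULL rows yield NULL and are not counted
          count(a IS NULL OR NULL) OVER (ORDER BY id) AS a_grp
   FROM  (VALUES (1, 1), (2, NULL), (3, 5), (4, 6)) AS t(id, a)
   ) sub
ORDER  BY id;
-- id | a    | a_grp | a_sum
--  1 | 1    | 0     | 1
--  2 | NULL | 1     | NULL
--  3 | 5    | 1     | 5
--  4 | 6    | 1     | 11
```

Note that the running sum restarts after the NULL. A plain sum(a) OVER (ORDER BY id) masked with CASE would instead keep accumulating across the gap and return 1, NULL, 6, 12 for the same rows.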
SQL Fiddle (with some added values for column a to demonstrate the effect).