How can I use the SUM function with an OVER clause whose ORDER BY column has identical values, so that it returns the correct running sum?

Question

I have a scenario where I need to compute a sum column using the SQL SUM function. My sample data looks like this:

Sample table:

dateCol      myCol
-----------------------
'12:00:01'   3
'12:00:01'   4
'12:00:01'   5
'12:00:01'   NULL
'12:00:01'   NULL
'12:00:01'   3

I'm using the query shown below to get a sum over the myCol column:

select 
    dateCol, myCol,
    sum(case when dateCol is not null  then 1 end) over (order by dateCol) as sumCol
from   
    sampleTable;

I get these results:

    dateCol myCol   sumCol
--------------------------
1   12:00:01    3       4
2   12:00:01    4       4
3   12:00:01    5       4
4   12:00:01    NULL    4
5   12:00:01    NULL    4
6   12:00:01    3       4

But I expect these results:

    dateCol myCol   sumCol
--------------------------
1   12:00:01    3       1
2   12:00:01    4       2
3   12:00:01    5       3
4   12:00:01    NULL    3
5   12:00:01    NULL    3
6   12:00:01    3       4

How can I modify the query to return the expected result?

Answer 1

When a window has an ORDER BY but no explicit frame, the default frame in SQL is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, not ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With RANGE, rows that tie on the ORDER BY value are peers and all receive the same total. Every row in your sample has the same dateCol, so you seem to have no way to distinguish the rows.

You can try an explicit window specification:

select dateCol, myCol,
       count(dateCol) over (order by dateCol rows between unbounded preceding and current row) as sumCol
from sampleTable;

Notice that I also simplified the logic, using count() instead of sum().

If you have a column that specifies the ordering, add that column to the order by:
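A minimal sketch of the difference between the two frames, run against an in-memory SQLite database through Python's sqlite3 module. Both are assumptions: the question does not name a database, and window functions need SQLite 3.25 or later. The id column is hypothetical and exists only to make the row order deterministic. The sketch counts myCol rather than dateCol, because the figures in the question (4 on every row, and 1, 2, 3, 3, 3, 4 expected) match the number of non-NULL myCol values; that is a reading of the sample output, not something the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is a hypothetical tie-breaker column, not part of the original table.
conn.execute(
    "CREATE TABLE sampleTable (id INTEGER PRIMARY KEY, dateCol TEXT, myCol INTEGER)"
)
conn.executemany(
    "INSERT INTO sampleTable (dateCol, myCol) VALUES (?, ?)",
    [("12:00:01", v) for v in (3, 4, 5, None, None, 3)],
)

# Default frame (RANGE): every row ties on dateCol, so all rows are peers
# and each one receives the total for the whole peer group.
range_counts = [r[0] for r in conn.execute(
    "SELECT count(myCol) OVER (ORDER BY dateCol) FROM sampleTable ORDER BY id"
)]

# Explicit ROWS frame, with id as a tie-breaker so the running count is deterministic.
rows_counts = [r[0] for r in conn.execute(
    "SELECT count(myCol) OVER (ORDER BY dateCol, id "
    "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) "
    "FROM sampleTable ORDER BY id"
)]

print(range_counts)  # same value on every row
print(rows_counts)   # running count of non-NULL myCol values
```

With count(dateCol) instead, the ROWS version would simply number the rows 1 through 6, since dateCol is never NULL in the sample.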

select dateCol, myCol,
       count(dateCol) over (order by dateCol, ?) as sumCol
from sampleTable;

That will make the sort stable and distinguish the rows.

Absent that, you can create a column. But the results may come back in a different order, because SQL tables represent unordered sets. So:

select dateCol, myCol,
       count(dateCol) over (order by dateCol, seqnum) as sumCol
from (select st.*, row_number() over (order by dateCol) as seqnum
      from sampleTable st
     ) st;
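As a rough illustration of the generated-column approach, the sketch below runs it on the sample data in an in-memory SQLite database via Python's sqlite3 module (an assumption; the question does not name a database, and SQLite 3.25 or later is needed for window functions). The myColCount column is an addition for comparison: the expected output in the question appears to count non-NULL myCol values rather than dateCol values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sampleTable (dateCol TEXT, myCol INTEGER)")
conn.executemany(
    "INSERT INTO sampleTable VALUES (?, ?)",
    [("12:00:01", v) for v in (3, 4, 5, None, None, 3)],
)

# seqnum is unique, so no two rows are peers and the default RANGE frame
# behaves like a row-by-row running count.
query = """
SELECT seqnum, myCol,
       count(dateCol) OVER (ORDER BY dateCol, seqnum) AS dateCount,
       count(myCol)   OVER (ORDER BY dateCol, seqnum) AS myColCount
FROM (SELECT st.*, row_number() OVER (ORDER BY dateCol) AS seqnum
      FROM sampleTable st
     ) st
ORDER BY seqnum
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

date_counts = [r[2] for r in rows]
mycol_counts = [r[3] for r in rows]
```

Because row_number() breaks the ties on dateCol arbitrarily, which myCol value lands at which seqnum is not guaranteed; only the shape of the running counts is.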

Answer 2

I will try to explain using standard SQL. You are trying to group dateCol and myCol with the aggregate function sum. Basically, you need to define a GROUP BY clause, and the result can be sorted using an ordinary ORDER BY clause:

  select dateCol, myCol,
         sum(case when dateCol is not null  then 1 else 0 end)  as sumCol
  from sampleTable
  group by dateCol, myCol
  order by dateCol
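For comparison, here is a sketch of what this GROUP BY query returns for the sample data, again using an in-memory SQLite database through Python's sqlite3 module (an assumption, since the question does not name a database). It produces one row per distinct (dateCol, myCol) pair rather than one running total per input row, so it may not match the six-row output the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sampleTable (dateCol TEXT, myCol INTEGER)")
conn.executemany(
    "INSERT INTO sampleTable VALUES (?, ?)",
    [("12:00:01", v) for v in (3, 4, 5, None, None, 3)],
)

grouped = conn.execute("""
SELECT dateCol, myCol,
       sum(CASE WHEN dateCol IS NOT NULL THEN 1 ELSE 0 END) AS sumCol
FROM sampleTable
GROUP BY dateCol, myCol
ORDER BY dateCol
""").fetchall()

# Map each myCol value (NULL becomes None) to the size of its group.
counts = {my_col: total for _, my_col, total in grouped}
print(counts)
```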
