Running total sum over date in Presto SQL

I'm trying to calculate the cumulative sums of columns T and S, ordered by date, from the sample data below using Presto SQL.

Date   | T | S 
1/2/19 | 2 | 5
2/1/19 | 5 | 1
3/1/19 | 1 | 1

I would like to get:

Date   | T | S | cum_T | cum_S 
1/2/19 | 2 | 5 |    2  |  5 
2/1/19 | 5 | 1 |    7  |  6
3/1/19 | 1 | 1 |    8  |  7

However, when I run the query below in Presto, I get an unexpected error message telling me to put columns T and S into the GROUP BY clause.

Is this expected? When I remove the GROUP BY from my query, it runs without error but produces duplicate date rows.

select
  date_trunc('day',tb1.date),
  sum(tb1.S) over (partition by date_trunc('day',tb1.date) order by date_trunc('day',tb1.date) rows unbounded preceding )  as cum_S,
  sum(tb1.T) over (partition by date_trunc('day',tb1.date) order by date_trunc('day',tb1.date) rows unbounded preceding)  as cum_T
from esi_dpd_bi_esds_prst.points_tb1_use_dedup_18months_vw tb1
where 
  tb1.reason_id not in (45,264,418,983,990,997,999,1574)
  and tb1.group_id not in (22)
  and tb1.point_status not in (3)
  and tb1.date between cast(DATE '2019-01-01' as date) and cast( DATE '2019-01-03' as date)
group by 
    1
order by date_trunc('day',tb1.date) desc 

The error looks like this:

Error: line 3:1: '"sum"(tb1.S) OVER (PARTITION BY "date_trunc"('day', tb1.tb1) ORDER BY "date_trunc"('day', tb1.tb1) ASC ROWS UNBOUNDED PRECEDING)' must be an aggregate expression or appear in GROUP BY clause.

You have an aggregation query and you want to mix the aggregations with window functions. The correct syntax is:

select date_trunc('day', tb1.date),
       sum(tb1.S) as S,
       sum(tb1.T) as T,
       sum(sum(tb1.S)) over (order by date_trunc('day', tb1.date) rows unbounded preceding )  as cum_S,
       sum(sum(tb1.T)) over (order by date_trunc('day', tb1.date) rows unbounded preceding)  as cum_T
from esi_dpd_bi_esds_prst.points_tb1_use_dedup_18months_vw tb1
where tb1.reason_id not in (45, 264, 418, 983, 990, 997, 999, 1574) and
      tb1.group_id not in (22) and
      tb1.point_status not in (3) and
      tb1.date between cast(DATE '2019-01-01' as date) and cast( DATE '2019-01-03' as date)
group by 1
order by date_trunc('day', tb1.date) desc ;

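That is, the window function runs after the aggregation, so it has to operate on the aggregated value: the inner sum() is the per-day aggregate and the outer sum() ... over (...) is the running total across days. The partition by is also dropped, because partitioning by day would restart the running total on every day.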
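
To make the "aggregate first, then window" order concrete, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. Everything in it is an assumption for illustration: SQLite stands in for Presto, the `points` table and its `ts` column are hypothetical, the rows are made up so that they add up to the daily totals in the question, and `date(ts)` takes the place of `date_trunc('day', ...)`. It also runs an equivalent subquery form, which spells out the two stages separately.

```python
import sqlite3

# Hypothetical stand-in data: several rows per day, chosen so the daily
# totals match the sample in the question (T: 2, 5, 1 and S: 5, 1, 1).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (ts TEXT, t INTEGER, s INTEGER)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?, ?)",
    [
        ("2019-01-01 08:00:00", 1, 2),
        ("2019-01-01 17:30:00", 1, 3),
        ("2019-01-02 09:15:00", 2, 0),
        ("2019-01-02 21:00:00", 3, 1),
        ("2019-01-03 12:00:00", 1, 1),
    ],
)

# Nested form: the inner SUM is the GROUP BY aggregate, the outer SUM is
# the window function that accumulates the per-day totals.
nested_sql = """
SELECT date(ts) AS day,
       SUM(t) AS t,
       SUM(s) AS s,
       SUM(SUM(t)) OVER (ORDER BY date(ts) ROWS UNBOUNDED PRECEDING) AS cum_t,
       SUM(SUM(s)) OVER (ORDER BY date(ts) ROWS UNBOUNDED PRECEDING) AS cum_s
FROM points
GROUP BY date(ts)
ORDER BY day
"""

# Equivalent two-stage form: aggregate per day in a subquery, then apply
# an ordinary window SUM to the aggregated rows.
subquery_sql = """
SELECT day, t, s,
       SUM(t) OVER (ORDER BY day ROWS UNBOUNDED PRECEDING) AS cum_t,
       SUM(s) OVER (ORDER BY day ROWS UNBOUNDED PRECEDING) AS cum_s
FROM (
    SELECT date(ts) AS day, SUM(t) AS t, SUM(s) AS s
    FROM points
    GROUP BY date(ts)
) AS daily
ORDER BY day
"""

nested_rows = conn.execute(nested_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()

for row in nested_rows:
    print(row)
```

Both queries should return one row per day with the running totals from the question (cum_T of 2, 7, 8 and cum_S of 5, 6, 7). The subquery form is more verbose, but it can be easier to read when the nested sum(sum(...)) looks confusing.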
