
Rolling Sum when date is continuous

I'm trying to find, in SQL, how many consecutive days each employee has worked. I think a rolling sum might be the solution, but I don't know how to work it out.

My sample data is:

| Employee | work_period |
| 1        | 2019-01-01  |
| 1        | 2019-01-02  |
| 1        | 2019-01-03  |
| 1        | 2019-01-04  |
| 1        | 2019-01-05  |
| 1        | 2019-01-10  |
| 1        | 2019-01-11  |
| 1        | 2019-01-12  |
| 2        | 2019-01-20  |
| 2        | 2019-01-22  |
| 2        | 2019-01-23  |
| 2        | 2019-01-24  |

The desired result should be:

| Employee | work_period | Continuous Days |
| 1        | 2019-01-01  | 1               |
| 1        | 2019-01-02  | 2               |
| 1        | 2019-01-03  | 3               |
| 1        | 2019-01-04  | 4               |
| 1        | 2019-01-05  | 5               |
| 1        | 2019-01-10  | 1               |
| 1        | 2019-01-11  | 2               |
| 1        | 2019-01-12  | 3               |
| 2        | 2019-01-20  | 1               |
| 2        | 2019-01-22  | 1               |
| 2        | 2019-01-23  | 2               |
| 2        | 2019-01-24  | 3               |

If the days are not continuous, the count restarts from 1.

Just another option... Very similar to Gaps-and-Islands, but without the final aggregation.

Example

Select Employee
      ,work_period
      ,Cont_Days = row_number() over (partition by Employee,Grp Order by Work_Period)
 From  (
        Select *
              ,Grp = datediff(day,'1900-01-01',work_period) - row_number() over (partition by Employee Order by Work_Period) 
          From YourTable
       ) A

Returns

Employee    work_period Cont_Days
1           2019-01-01  1
1           2019-01-02  2
1           2019-01-03  3
1           2019-01-04  4
1           2019-01-05  5
1           2019-01-10  1
1           2019-01-11  2
1           2019-01-12  3
2           2019-01-20  1
2           2019-01-22  1
2           2019-01-23  2
2           2019-01-24  3
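
To see why the subtraction produces a usable group key, it can help to look at the intermediate values. The following is a minimal sketch, assuming SQL Server and a small inline data set in place of YourTable; the CTE and the DayNo and RowNo column names are illustrative and not part of the answer above.

```sql
-- Hypothetical inline sample data standing in for YourTable (SQL Server syntax).
WITH YourTable (Employee, work_period) AS (
    SELECT Employee, CAST(work_period AS date)
    FROM (VALUES (1, '2019-01-01'),
                 (1, '2019-01-02'),
                 (1, '2019-01-10'),
                 (2, '2019-01-20'),
                 (2, '2019-01-22')) v (Employee, work_period)
)
SELECT Employee,
       work_period,
       -- Days elapsed since a fixed anchor date: grows by 1 per calendar day.
       DayNo = DATEDIFF(day, '1900-01-01', work_period),
       -- Position of the row within the employee: grows by 1 per row.
       RowNo = ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY work_period),
       -- Stays constant while dates are consecutive, changes when a day is skipped.
       Grp   = DATEDIFF(day, '1900-01-01', work_period)
             - ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY work_period)
FROM YourTable
ORDER BY Employee, work_period;
```

Within a run of consecutive dates, DayNo and RowNo both rise by 1 per row, so Grp does not change. After a skipped day, DayNo jumps ahead of RowNo and Grp takes a new value, which is what lets the outer row_number() restart at 1.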

This is similar to John's answer but a bit simpler.

You can identify groups of adjacent rows by subtracting a sequence of numbers from the dates: within a run of consecutive days, the difference is constant. So:

select Employee, work_period,
       row_number() over (partition by employee, grp order by work_period) as day_counter
from (select t.*,
             dateadd(day,
                     - row_number() over (partition by employee order by work_period),
                     work_period
                    ) as grp
      from t
     ) t;

Another interesting way to do this is to identify the rows where the "islands" start and then use datediff():

select t.*,
       datediff(day,
                max(case when island_start_flag = 1 then work_period end) over (partition by employee order by work_period),
                work_period
               ) + 1 as days_counter
from (select t.*,
             (case when lag(work_period) over (partition by employee order by work_period) >= dateadd(day, -1, work_period)
                   then 0 else 1
              end) as island_start_flag
      from t
     ) t;
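
The two steps of this approach can be inspected separately. Below is a minimal sketch, assuming SQL Server, a column named work_period as in the question, and a small inline data set in place of t; the flagged CTE and the island_start column are illustrative names.

```sql
-- Hypothetical inline sample data standing in for table t (SQL Server syntax).
WITH t (employee, work_period) AS (
    SELECT employee, CAST(work_period AS date)
    FROM (VALUES (1, '2019-01-04'),
                 (1, '2019-01-05'),
                 (1, '2019-01-10'),
                 (1, '2019-01-11')) v (employee, work_period)
),
flagged AS (
    SELECT t.*,
           -- 1 on the first row of each run, 0 on rows that continue a run.
           CASE WHEN LAG(work_period) OVER (PARTITION BY employee ORDER BY work_period)
                     >= DATEADD(day, -1, work_period)
                THEN 0 ELSE 1
           END AS island_start_flag
    FROM t
)
SELECT employee,
       work_period,
       island_start_flag,
       -- Running maximum over flagged dates only: the start date of the current island.
       MAX(CASE WHEN island_start_flag = 1 THEN work_period END)
           OVER (PARTITION BY employee ORDER BY work_period) AS island_start
FROM flagged
ORDER BY employee, work_period;
```

With this data, 2019-01-04 and 2019-01-10 are flagged as island starts, and island_start carries each of those dates forward across the rows that follow it. Taking datediff() between island_start and work_period and adding 1 then gives the day counter.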

You can first use lag() to check whether the previous row per employee (as sorted by work_period) is exactly one day earlier than the current row. Use that in a CASE expression that returns 0 if the condition is true and 1 otherwise. Then use the windowed version of sum() to sum up the 0s and 1s per employee in the order of work_period. That gives you a group number for each run of continuous days for each employee. You can then PARTITION BY this group number, in addition to the employee, in a second windowed sum() that adds 1 for each row in the partition ordered by work_period.

SELECT employee,
       work_period,
       sum(1) OVER (PARTITION BY employee,
                                 g
                    ORDER BY work_period) continuous_days
       FROM (SELECT employee,
                    work_period,
                    sum(c) OVER (PARTITION BY employee
                                 ORDER BY work_period) g
                    FROM (SELECT employee,
                                 work_period,
                                 CASE
                                   WHEN lag(work_period) OVER (PARTITION BY employee
                                                               ORDER BY work_period) = dateadd(day, -1, work_period) THEN
                                     0
                                   ELSE
                                     1
                                 END c
                                 FROM elbat) x) y;
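
For readers who want to see the flag and the running group number side by side, here is a minimal sketch, assuming SQL Server and a small inline data set in place of elbat (the rows are employee 2 from the question). It restates the two inner levels of the query as CTEs, which is a layout choice of this sketch rather than part of the answer.

```sql
-- Hypothetical inline sample data standing in for table elbat (SQL Server syntax).
WITH elbat (employee, work_period) AS (
    SELECT employee, CAST(work_period AS date)
    FROM (VALUES (2, '2019-01-20'),
                 (2, '2019-01-22'),
                 (2, '2019-01-23'),
                 (2, '2019-01-24')) v (employee, work_period)
),
x AS (
    SELECT employee,
           work_period,
           -- 0 when the previous row is exactly one day earlier, 1 otherwise.
           CASE WHEN LAG(work_period) OVER (PARTITION BY employee ORDER BY work_period)
                     = DATEADD(day, -1, work_period)
                THEN 0 ELSE 1
           END AS c
    FROM elbat
)
SELECT employee,
       work_period,
       c,
       -- Running total of the flags: increases by 1 each time a new run starts.
       SUM(c) OVER (PARTITION BY employee ORDER BY work_period) AS g
FROM x
ORDER BY employee, work_period;
```

Here c is 1 for 2019-01-20 and 2019-01-22 and 0 for the other two rows, so g is 1 for the first row and 2 for the remaining three. Partitioning by employee and g and counting rows then yields 1 for 2019-01-20 and 1, 2, 3 for the second run.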

db<>fiddle


 