
Cumulative summation over null values

I am trying to calculate a cumulative sum column to find the number of employees currently working in each month. For months in which nobody joined, I get NULL instead of the total carried over from the previous month.

Table employees:

id    date_started     date_terminated
1      01-Apr-14       NULL
2      21-Apr-14       NULL
3      11-Apr-14       NULL
4      01-Apr-14       NULL
5      01-Apr-14       NULL
6      05-Apr-14       NULL
7      01-Apr-14       NULL
8      01-Apr-14       NULL
9      01-Apr-14       NULL
10     29-Apr-14       NULL
11     21-Apr-14       NULL
12     01-Apr-14       NULL
13     01-Apr-14       NULL
14     01-Apr-14       NULL
15     05-Aug-14       NULL
16     01-Oct-14       NULL
17     13-Oct-14       NULL
18     22-Oct-14       NULL
19     25-Oct-14       NULL
20     29-Oct-14       NULL

Table dates: contains a date column with values from 2011-Jan-01 to the current date.

Result obtained from my query:

+--------------------------------------------------------------+
| date                  | employee_joined | present_employees  |
+--------------------------------------------------------------+
| 2014-01-01 00:00:00-7 |            NULL |              NULL  |
| 2014-02-01 00:00:00-7 |            NULL |              NULL  |
| 2014-03-01 00:00:00-7 |            NULL |              NULL  |
| 2014-04-01 00:00:00-7 |              14 |                14  |
| 2014-05-01 00:00:00-7 |            NULL |              NULL  |
| 2014-06-01 00:00:00-7 |            NULL |              NULL  |
| 2014-07-01 00:00:00-7 |            NULL |              NULL  |
| 2014-08-01 00:00:00-7 |               1 |                15  |
| 2014-09-01 00:00:00-7 |            NULL |              NULL  |
| 2014-10-01 00:00:00-7 |               5 |                20  |
+--------------------------------------------------------------+

The result I am looking for:

+--------------------------------------------------------------+
| date                  | employee_joined | present_employees  |
+--------------------------------------------------------------+
| 2014-01-01 00:00:00-7 |            NULL |              NULL  |
| 2014-02-01 00:00:00-7 |            NULL |              NULL  |
| 2014-03-01 00:00:00-7 |            NULL |              NULL  |
| 2014-04-01 00:00:00-7 |              14 |                14  |
| 2014-05-01 00:00:00-7 |            NULL |                14  |
| 2014-06-01 00:00:00-7 |            NULL |                14  |
| 2014-07-01 00:00:00-7 |            NULL |                14  |
| 2014-08-01 00:00:00-7 |               1 |                15  |
| 2014-09-01 00:00:00-7 |            NULL |                15  |
| 2014-10-01 00:00:00-7 |               5 |                20  |
+--------------------------------------------------------------+

This is the query I have tried:

/*-----ONLY FOR PRESENT EMPLOYEES USING CUMULATIVE SUM--------*/
WITH fdates AS 
    (
        SELECT DATE_TRUNC('month', d.date) AS date
        FROM dates d
        WHERE d.date::DATE <= '10-01-2014' AND
        d.date::DATE >= '01-01-2014'
        group by DATE_TRUNC('month', d.date)
    ),  
employeeJoin AS
    (
        SELECT COALESCE( COUNT(e.id), 0 ) AS employee_joined, 
            DATE_TRUNC( 'month', e.date_started) AS date_started
        FROM employees e GROUP BY DATE_TRUNC( 'month', e.date_started)
    ),
employeeJoinRownum AS
    (   
        SELECT employee_joined, date_started, row_number() OVER (order by date_started) rownum
        FROM employeeJoin
    ) 
SELECT d.*, employee_joined AS employee_joined,
        (SELECT sum(employee_joined) FROM employeeJoinRownum eJ2 WHERE eJ2.rownum <= eJ1.rownum) AS Total_Joined_Employees
    FROM fdates d
    LEFT OUTER JOIN employeeJoinRownum eJ1 ON( eJ1.date_started = DATE_TRUNC('month', d.date) )
    ORDER BY d.date

The following query counts the employees who joined and the employees who left on each date, then uses a window function to accumulate the net change.

SELECT
  dates.date,
  COUNT(DISTINCT ej.id) AS employee_joined,
  COUNT(DISTINCT el.id) AS employee_left,
  SUM(COUNT(DISTINCT ej.id) - COUNT(DISTINCT el.id)) OVER (ORDER BY dates.date) AS present_employees
FROM
  dates LEFT JOIN employees ej
ON
  ej.date_started = dates.date LEFT JOIN employees el
ON
  el.date_terminated = dates.date
GROUP BY
  dates.date;
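
To see why this carries totals forward: in the question's query, a month without hires has no matching row in employeeJoinRownum, so eJ1.rownum is NULL and the correlated subquery sums over zero rows. Accumulating a net change over every calendar row avoids that. The following is a minimal Python sketch of the same logic at month granularity, not a translation of the SQL above. The start dates are a simplified stand-in for the question's sample data (14 hires in April, 1 in August, 5 in October), and the names month_start, net_change and present are made up for illustration.

```python
from collections import Counter
from datetime import date
from itertools import accumulate

# Simplified stand-in for the sample data: 14 hires in April 2014,
# 1 in August, 5 in October. Nobody has been terminated.
started = [date(2014, 4, 1)] * 14 + [date(2014, 8, 5)] + [date(2014, 10, 1)] * 5
terminated = []  # every date_terminated is NULL in the sample


def month_start(d):
    """Truncate a date to the first day of its month (like DATE_TRUNC('month', ...))."""
    return d.replace(day=1)


# Count join and leave events per month; a Counter returns 0 for absent months.
joined = Counter(month_start(d) for d in started)
left = Counter(month_start(d) for d in terminated)

# The calendar drives the result, so months without events still get a row.
months = [date(2014, m, 1) for m in range(1, 11)]

# Running sum of (joined - left), the equivalent of SUM(...) OVER (ORDER BY date).
net_change = [joined[m] - left[m] for m in months]
present = list(accumulate(net_change))

for m, p in zip(months, present):
    print(m.isoformat(), joined[m], p)
```

Note that this sketch reports 0 rather than NULL for the months before the first hire, because an absent month counts as a net change of zero.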

If you do not have a prefilled dates table, you can use the set-returning function generate_series instead and left join to it.

SELECT
  ...
FROM
  GENERATE_SERIES('2014-01-01', '2014-01-10', '1 day'::interval) AS dates(date) LEFT JOIN employees ej
ON
  ...
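
The idea of generating the calendar on the fly and left joining to it can be sketched in Python as follows. This is an illustration only: month_series is a hypothetical helper that steps by month rather than by day, and the counts dictionary stands in for the per-month join counts from the question.

```python
from datetime import date


def month_series(start, stop):
    """Yield the first day of each month from start to stop, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (stop.year, stop.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


# Hypothetical per-month join counts (April, August, October 2014).
counts = {date(2014, 4, 1): 14, date(2014, 8, 1): 1, date(2014, 10, 1): 5}

# "Left join": every generated month is kept; months without a match get None,
# which plays the role of NULL here.
rows = [(m, counts.get(m)) for m in month_series(date(2014, 1, 1), date(2014, 10, 1))]

for m, joined in rows:
    print(m.isoformat(), joined)
```

Because the generated series is on the left side of the join, months with no hires still appear in the output, which is what lets a later running sum carry the total forward.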

You could normalize the table by creating one row for each join event and one row for each termination event:

select  welcome as date
,       1 as size_change
from    emps
union all
select  bye
,       -1
from    emps
where   bye is not null

Now you can use a running sum to calculate the current size:

; with  events as
        (
        select  welcome as date
        ,       1 as size_change
        from    emps
        union all
        select  bye
        ,       -1
        from    emps
        where   bye is not null
        )
select  distinct to_char(date, 'YYYY-MM-DD') as date
,       sum(size_change) over (order by date) as family_size
from    events
order by
        date
;
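
To try the event-based approach without a PostgreSQL server, here is a rough sketch that runs an equivalent query in an in-memory SQLite database from Python. It assumes an SQLite build with window function support (3.25 or later). The four rows, including the one termination, are invented for illustration, dates are stored as ISO text, and the column is named event_date instead of date to avoid clashing with SQLite's date function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emps (id integer, welcome text, bye text)")

# Hypothetical sample rows: (id, join date, termination date or NULL).
conn.executemany(
    "insert into emps values (?, ?, ?)",
    [
        (1, "2014-04-01", None),
        (2, "2014-04-01", "2014-08-05"),
        (3, "2014-08-05", None),
        (4, "2014-10-13", None),
    ],
)

# One +1 event per join, one -1 event per termination, then a running sum.
# Rows sharing a date are peers in the window, so they get the same total
# and DISTINCT collapses them into one row per date.
rows = conn.execute(
    """
    with events as (
        select welcome as event_date, 1 as size_change from emps
        union all
        select bye, -1 from emps where bye is not null
    )
    select distinct event_date,
           sum(size_change) over (order by event_date) as family_size
    from events
    order by event_date
    """
).fetchall()

for event_date, family_size in rows:
    print(event_date, family_size)
```

On 2014-08-05 one employee leaves and another joins, so the total stays at 2 for that date. Only dates with at least one event appear; joining against a calendar table would be needed to fill in the gaps.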

Example at SQL Fiddle.
