
SQL Server : 5 days moving average for last month

I have a view with two columns, TOTAL and DATE. The view contains no rows for Saturdays or Sundays, e.g.:

TOTAL      DATE
 0         1-1-2014
33         2-1-2014
11         3-1-2014
55         5-1-2014
... 
25         15-1-2014
35         16-1-2014
17         17-1-2014
40         20-1-2014
33         21-1-2014
...

The task I'm trying to complete is computing a 5-day average of TOTAL for the whole month over a sliding window of business days, i.e. 13th to 17th, 14th to 20th (weekends are skipped), 15th to 21st, etc., up to the current date. And YES, they ARE OVERLAPPING RANGES.

Any idea how to achieve it in SQL?

Example of the output (starting from the 6th and using fake numbers)

 5daysAVG        Start_day
 22               1-01-2014   <-counted between 1st to 6th Jan excl 4 and 5 of Jan
 25               2-01-2014   <- Counted between 2nd to 7th excluding 4 and 5
 27               3-01-2014   <- 3rd to 8th excluding 4/5
 24               6-01-2014   <-6th to 10th
 ...
 33               today-5

Okay, I usually set up some test data to play with.

Here is some code to create a [work] table in tempdb. The loop skips weekends, and each total is a random integer from 0 to 39.

-- Just playing
use tempdb;
go

-- drop existing
if object_id('work') is not null
  drop table work;
go

-- create new
create table work
(
  my_date date,
  my_total int
);
go

-- clear data
truncate table work;
go

-- Monday = 1 
SET DATEFIRST 1;
GO

-- insert data
declare @dt date = '20131231';
declare @hr int;
while (@dt < '20140201')
begin
  set @hr = floor(rand(checksum(newid())) * 40);
  set @dt = dateadd(d, 1, @dt);
  if (datepart(dw, @dt) < 6)
    insert into work values (@dt, @hr);
end
go

This becomes really easy in SQL Server 2012 with the new LEAD() window function.

-- show data
with cte_summary as
(
select 
  row_number() over (order by my_date) as my_num,
  my_date, 
  my_total,
  LEAD(my_total, 0, 0) OVER (ORDER BY my_date) +
  LEAD(my_total, 1, 0) OVER (ORDER BY my_date) +
  LEAD(my_total, 2, 0) OVER (ORDER BY my_date) +
  LEAD(my_total, 3, 0) OVER (ORDER BY my_date) +
  LEAD(my_total, 4, 0) OVER (ORDER BY my_date) as my_sum,
  (select count(*) from work) as my_cnt
from work 
)
select * from cte_summary 
where my_num <= my_cnt - 4

Basically, we give a row number to each row, calculate the sum for rows 0 (current) to row 4 (4 away) and a total count.

Since this is a moving sum over five periods, the last four dates do not have a full window of data. Therefore, we toss them out with my_num <= my_cnt - 4.

I hope this solves your problem!

If you only care about one number for the month, change the final select to the following. I left the other rows in above so you can see what is going on. Dividing by 5.0 rather than 5 avoids integer truncation.

select avg(my_sum / 5.0) as my_stat
from cte_summary 
where my_num <= my_cnt - 4
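
As an alternative to chaining LEAD() calls, SQL Server 2012 and later also accept a window frame on aggregate functions. The following is a minimal sketch of that approach, not the method used above: the table variable @demo and its made-up values only stand in for the [work] table, and the names avg_5d and days_in_window are assumptions for illustration.

```sql
-- Sketch for SQL Server 2012+; @demo is a hypothetical stand-in for [work].
declare @demo table (my_date date, my_total int);
insert into @demo values
  ('20140101', 0), ('20140102', 33), ('20140103', 11),
  ('20140106', 55), ('20140107', 25), ('20140108', 35),
  ('20140109', 17);

with cte_window as
(
select
  my_date,
  -- multiply by 1.0 so the average is not truncated to an int
  avg(my_total * 1.0) over (order by my_date
                            rows between current row and 4 following) as avg_5d,
  -- number of rows actually present in the frame
  count(*) over (order by my_date
                 rows between current row and 4 following) as days_in_window
from @demo
)
select my_date, avg_5d
from cte_window
where days_in_window = 5   -- drop trailing windows with fewer than 5 rows
order by my_date;
```

With the seven sample rows, only the first three dates have a full five-row window, so only those three are returned.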

FOR SQL SERVER < 2012 & >= 2005

Like anything in this world, there is always a way to do it. I used a small tally table to step through the data and collect sets of 5 data points for averages.

-- show data
with
cte_tally as
(
select 
    row_number() over (order by (select 1)) as n
from 
    sys.all_columns x 
),
cte_data as
(
select 
    row_number() over (order by my_date) as my_num,
    my_date, 
    my_total 
from 
    work 
)
select 
  (select my_date from cte_data where my_num = n) as the_date,
  (
    select sum(my_total) / 5.0   -- 5.0 avoids integer division
    from cte_data 
    where my_num >= n and my_num < n+5
  ) as the_average
from cte_tally
where n <= (select count(*)-4 from work) 

Here is an explanation of the common table expressions (CTE).

cte_data = orders the data by date and assigns row numbers
cte_tally = generates the integers 1, 2, 3, ... (one per row of sys.all_columns)

For each starting row n, the outer query averages rows n through n+4 and shows the date of row n.


This solution does not depend on holidays or weekends. Whatever rows are present, it simply orders them by date and averages each run of five consecutive rows.

If you need to filter out holidays and weekends, create a holiday table. Add a where clause to cte_data that checks for NOT IN (SELECT DATE FROM HOLIDAY TABLE).
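
The holiday filter described above could look roughly like this. It is only a sketch: the table variables @work and @holiday and the column holiday_date are hypothetical names, and the single holiday row is made up.

```sql
-- @work and @holiday are hypothetical stand-ins for the [work] table
-- and a holiday table; the column names are assumptions.
declare @work table (my_date date, my_total int);
declare @holiday table (holiday_date date not null);

insert into @work values
  ('20140101', 0), ('20140102', 33), ('20140103', 11), ('20140106', 55);
insert into @holiday values ('20140101');   -- treat 1 Jan as a holiday

with cte_data as
(
select
    row_number() over (order by my_date) as my_num,
    my_date,
    my_total
from
    @work
where
    -- skip any date listed in the holiday table
    my_date not in (select holiday_date from @holiday)
)
select * from cte_data;
```

The holiday column is declared NOT NULL on purpose: if the subquery of a NOT IN returns a NULL, the predicate never evaluates to true and every row is filtered out.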

Good luck!

SQL Server offers the datepart(wk, ...) function to get the week of the year. Unfortunately, week 1 always starts on January 1, so a calendar week that spans the new year is split into two week numbers.

Instead, you can find sequences of consecutive values and group them together:

select min(date), max(date), avg(total*1.0)
from (select v.*, row_number() over (order by date) as seqnum
      from view
     ) v
group by dateadd(day, -seqnum, date);

The idea is that subtracting a sequence of numbers from a sequence of consecutive days yields a constant.
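
To see why the grouping key is constant within a run of consecutive days, it may help to look at the intermediate values. This is a small sketch with made-up rows; the table variable @v is a hypothetical stand-in for the view, and grp is just an illustrative alias.

```sql
-- @v is a hypothetical stand-in for the view; the rows are made up.
declare @v table ([date] date, total int);
insert into @v values
  ('20140102', 33), ('20140103', 11),                    -- Thu, Fri
  ('20140106', 55), ('20140107', 25), ('20140108', 35);  -- Mon, Tue, Wed

select v.[date],
       v.seqnum,
       -- date minus its row number: constant while dates are consecutive
       dateadd(day, -v.seqnum, v.[date]) as grp
from (select t.*, row_number() over (order by t.[date]) as seqnum
      from @v t
     ) v
order by v.[date];
-- grp is 2014-01-01 for the first two rows and 2014-01-03 for the last three
```

The weekend gap makes the date jump by three while the row number advances by one, which is what shifts grp to a new value for the second run.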

You can also do this by using a canonical date and dividing by 7:

select min(date), max(date), avg(total*1.0)
from view v
group by datediff(day, '2000-01-03', date) / 7;

The date '2000-01-03' is an arbitrary Monday.

EDIT:

You seem to want a 5-day moving average. Because there is no data for the weekends, any window of 7 calendar days contains exactly five weekdays, so avg() over that window should just work:

select v1.date, avg(v2.total*1.0)
from view v1 join
     view v2
     on v2.date >= v1.date and v2.date < dateadd(day, 7, v1.date)
group by v1.date;
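
Near the end of the data the 7-day window holds fewer than five rows, so the query above would still report an average for those start dates. A possible way to keep only complete windows is a HAVING clause, sketched here with made-up rows; the table variable @v and the aliases start_day and avg_5d are assumptions for illustration.

```sql
-- @v is a hypothetical stand-in for the view; the rows are made up.
declare @v table ([date] date, total int);
insert into @v values
  ('20140106', 55), ('20140107', 25), ('20140108', 35),
  ('20140109', 17), ('20140110', 40), ('20140113', 33);

select v1.[date] as start_day,
       avg(v2.total * 1.0) as avg_5d
from @v v1 join
     @v v2
     on v2.[date] >= v1.[date] and v2.[date] < dateadd(day, 7, v1.[date])
group by v1.[date]
having count(*) = 5   -- keep only windows with all five weekdays present
order by v1.[date];
```

With these six rows, only the windows starting on 6 January and 7 January contain five rows, so only those two start dates are returned.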

Here's a solution that works in SQL Server 2008.

The concept here is to use a table variable to normalize the data first; the rest is simple math to count and average the days.

By normalizing the data, I mean getting rid of weekend days and assigning IDs in a temporary table variable that can be used to identify the rows.

Check it out:

-- This represents your original source table
Declare @YourSourceTable Table 
(
    Total Int, 
    CreatedDate DateTime
)

-- This represents some test data in your table with 2 weekends
Insert Into @YourSourceTable Values (0, '1-1-2014')
Insert Into @YourSourceTable Values (33, '1-2-2014')
Insert Into @YourSourceTable Values (11, '1-3-2014')
Insert Into @YourSourceTable Values (55, '1-4-2014')
Insert Into @YourSourceTable Values (25, '1-5-2014')
Insert Into @YourSourceTable Values (35, '1-6-2014')
Insert Into @YourSourceTable Values (17, '1-7-2014')
Insert Into @YourSourceTable Values (40, '1-8-2014')
Insert Into @YourSourceTable Values (33, '1-9-2014')
Insert Into @YourSourceTable Values (43, '1-10-2014')
Insert Into @YourSourceTable Values (21, '1-11-2014')
Insert Into @YourSourceTable Values (5, '1-12-2014')
Insert Into @YourSourceTable Values (12, '1-13-2014')
Insert Into @YourSourceTable Values (16, '1-14-2014')

-- Just a quick test to see the source data
Select * From @YourSourceTable

/*  Now we need to normalize the data; 
    Let's just remove the weekends and get some consistent ID's to use in a separate table variable
    We will use DateName SQL Function to exclude weekend days while also giving 
    sequential ID's to the remaining data in our temporary table variable,
    which are easier to query later 
*/

Declare @WorkingTable Table 
(
    TempID Int Identity, 
    Total Int, 
    CreatedDate DateTime
)

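-- Let's get the data normalized:
-- (Order By makes the identity values follow date order;
--  the DateName comparison assumes an English session language)
Insert Into 
    @WorkingTable
Select 
    Total,  
    CreatedDate
From @YourSourceTable
Where DateName(Weekday, CreatedDate) != 'Saturday' 
    And DateName(Weekday, CreatedDate) != 'Sunday' 
Order By CreatedDate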

-- Let's run a 2nd quick sanity check to see our normalized data
Select * From @WorkingTable

/*  Now that data is normalized, we can just use the ID's to get each 5 day range and
    perform simple average function on the columns; I chose to use a CTE here just to
    be able to query it and drop the NULL ranges (where there wasn't 5 days of data)
    without having to recalculate each average
*/

; With rangeCte (StartDate, TotalAverage)
As
(
    Select
        wt.createddate As StartDate,
        (
            wt.Total +  
            (Select Total From @WorkingTable Where TempID = wt.TempID + 1) +
            (Select Total From @WorkingTable Where TempID = wt.TempID + 2) +
            (Select Total From @WorkingTable Where TempID = wt.TempID + 3) +
            (Select Total From @WorkingTable Where TempID = wt.TempID + 4)
        ) / 5.0   -- 5.0 avoids integer division
        As TotalAverage
    From 
        @WorkingTable wt
)
Select 
    StartDate,
    TotalAverage
From rangeCte
Where TotalAverage 
    Is Not Null
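
The weekend filter above compares DateName() output with English day names, so it depends on the session language. One possible language-independent variant is sketched below; it is an alternative technique, not part of the solution above, and the table variable @d with its sample dates is made up for illustration.

```sql
-- @d is a hypothetical sample table; 3 Jan 2014 is a Friday.
declare @d table (CreatedDate datetime);
insert into @d values ('20140103'), ('20140104'), ('20140105'), ('20140106');

select CreatedDate
from @d
-- (datepart + @@datefirst) % 7 is 0 for Saturday and 1 for Sunday,
-- whatever SET DATEFIRST or SET LANGUAGE is in effect
where (datepart(weekday, CreatedDate) + @@datefirst) % 7 not in (0, 1);
```

With the four sample dates, this keeps Friday 3 January and Monday 6 January and drops the weekend in between.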

