T-SQL absence by month from start date end date

Question

I have an interesting query to write and am trying to find the best way to do it. We have an absence table in our personnel database that records the staff ID and a start date and end date for each absence. The end date is NULL if it has not yet been entered (the person has not returned). I cannot change the design.

They would like a report of the number of absences by month (a 12-month trend). Because staff can be off across a month boundary, this may be difficult to calculate.

E.g. for staff off from 25/11/08 to 05/12/08 (dd/MM/yy), I would want the days in November to go into the November count and the days in December into the December count.

My current thinking is that, in order to count the number of days, I need to split each start/end date range into one record per day of the absence, assign each day to the month it falls in, and then group the data for reporting. For absences without an end date, I would treat NULL as the current date, since those staff are still absent.

What would be the best way to do this?

Any better ways?

Edit: This is currently SQL Server 2000. Hoping for an upgrade soon.

Answer 1

I have had a similar issue where there has been a table of start/end dates designed for data storage but not for reporting.

I sought out the "fastest executing" solution and found that it was to create a second table holding the month start dates. I populated it with the months from Jan 2000 to Jan 2070. I expect that will suffice, or that I'll get a large pay cheque in 2070 to come and update it...

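CREATE TABLE months (start DATETIME NOT NULL PRIMARY KEY)
-- Populate with the first day of every month that may ever be needed.
-- The primary key on start also provides the recommended index.

-- [end] is bracketed because END is a reserved word in T-SQL.
-- A NULL end date is treated as today (the person is still absent).
-- The end date is treated as exclusive; add a day to it if it holds
-- the last day of the absence rather than the return date.
SELECT
    months.start,
    data.id,
    SUM(DATEDIFF(DAY,
            -- later of the absence start and the month start
            CASE WHEN data.start > months.start
                THEN data.start
                ELSE months.start
            END,
            -- earlier of the absence end and the next month start
            CASE WHEN COALESCE(data.[end], GETDATE()) < DATEADD(month, 1, months.start)
                THEN COALESCE(data.[end], GETDATE())
                ELSE DATEADD(month, 1, months.start)
            END)) AS days
FROM
    data
INNER JOIN
    months
        ON data.start < DATEADD(month, 1, months.start)
        AND COALESCE(data.[end], GETDATE()) > months.start
GROUP BY
    months.start,
    data.id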
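
As a rough sketch of the population step, the loop below fills a month table one row at a time. It uses a table variable named @months as a stand-in for the permanent months table, and the variable @month and the loop approach are assumptions for illustration rather than the original setup. The Jan 2000 to Jan 2070 range is the one mentioned above, and the syntax is kept to what SQL Server 2000 should accept.

```sql
-- Stand-in for the permanent months table (hypothetical name @months).
DECLARE @months TABLE (start DATETIME NOT NULL PRIMARY KEY)

-- Loop variable holding the first day of the month being inserted.
DECLARE @month DATETIME
SET @month = '20000101'   -- yyyymmdd literal, independent of DATEFORMAT

-- Insert one row per month, from Jan 2000 through Jan 2070 inclusive.
WHILE @month <= '20700101'
BEGIN
    INSERT INTO @months (start) VALUES (@month)
    SET @month = DATEADD(month, 1, @month)
END

-- Quick sanity check of what was generated.
SELECT
    COUNT(*)   AS month_count,
    MIN(start) AS first_month,
    MAX(start) AS last_month
FROM
    @months
```

For a real report the same loop would insert into the permanent table instead, so the rows only need to be generated once.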

That join can be quite slow for various reasons. I'll look up another answer, to a different question, that shows why and how to optimise the join.

EDIT:

Here is another answer relating to overlapping date ranges and how to speed up the joins...

Query max number of simultaneous events

solution1
3 ACCEPTED 2009-01-30 16:11:51