
Combine SQL records based on dates

I have the following table:

startDate   endDate
----------------------
01-01-2014  01-07-2014
01-08-2014  01-14-2014
01-15-2014  01-21-2014

01-28-2014  02-03-2014
02-04-2014  02-10-2014

I want to bundle the dates together to minimize the number of records. The output should look like this:

startDate   endDate
----------------------
01-01-2014  01-21-2014
01-28-2014  02-10-2014

Two records can be linked if the startDate of one is exactly one day after the endDate of the other.

Can this be achieved without using cursors?

As mentioned in the comments, you need a recursive CTE plus a window function to merge the consecutive date ranges:

;WITH cte
     AS (SELECT StartDate,
                EndDate
         FROM   yourtable
         UNION ALL
         SELECT a.StartDate,
                b.EndDate
         FROM   cte a
                JOIN yourtable b
                  ON Dateadd(DAY, 1, a.EndDate) = b.StartDate),
     cte1
     AS (SELECT StartDate,
                EndDate,
                Row_number()
                  OVER(
                    partition BY EndDate
                    ORDER BY StartDate ASC) AS rn
         FROM   cte)
SELECT StartDate,
       Max(EndDate) AS EndDate
FROM   cte1 a
WHERE  a.rn = 1
GROUP  BY StartDate
ORDER  BY EndDate 


Note: The code below requires SQL Server 2012 or later, because it uses LAG.

Use the following code:

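SELECT MIN(startdate) AS startdate, MAX(enddate) AS enddate
FROM
(
    -- cat: running total of diff, so all rows of one contiguous block share a value
    SELECT *, SUM(diff) OVER (ORDER BY startdate) AS cat FROM
    (
        SELECT
            startdate,
            enddate,
            -- diff = 1 when this row starts more than one day after the previous row ends
            CASE WHEN DATEDIFF(day, LAG(enddate) OVER (ORDER BY startdate), startdate) > 1 THEN 1 ELSE 0 END AS diff
        FROM <Your Table>
    ) t
) t
GROUP BY cat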

What happened?

The code above uses the LAG window function to fetch the previous row's end date and compares it with the current row's start date. If the gap is more than one day, diff is set to 1; otherwise it is set to 0. For sample data (extended here with a few extra rows), the innermost query returns:

startdate   enddate     diff
2014-01-01  2014-01-07  0
2014-01-08  2014-01-14  0
2014-01-15  2014-01-21  0
2014-01-28  2014-02-03  1
2014-02-04  2014-02-10  0
2014-02-11  2014-03-04  0
2014-03-14  2014-03-21  1
2014-04-01  2014-05-10  1

Applying SUM as a window function then produces a running total of diff from the first row to the current row, which gives every block of linked rows its own number (cat):

startdate   enddate     diff  cat
2014-01-01  2014-01-07  0     0
2014-01-08  2014-01-14  0     0
2014-01-15  2014-01-21  0     0
2014-01-28  2014-02-03  1     1
2014-02-04  2014-02-10  0     1
2014-02-11  2014-03-04  0     1
2014-03-14  2014-03-21  1     2
2014-04-01  2014-05-10  1     3

You can then aggregate the rows by taking MIN(startdate) and MAX(enddate) for each cat.
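The flag-and-running-total steps can also be traced outside the database. The following is a minimal Python sketch for illustration only, not part of the answer above; the function name `merge_adjacent` and the `sample` rows (taken from the question) are assumptions. It mirrors the three steps: compare each start date with the previous end date, keep a running total of the flags, then take the minimum start and maximum end per group.

```python
from datetime import date
from itertools import accumulate, groupby


def merge_adjacent(ranges):
    """Merge (start, end) ranges whose gap to the previous range is at most one day."""
    rows = sorted(ranges)  # plays the role of ORDER BY startdate
    # diff: 1 when the gap to the previous end date exceeds one day, else 0.
    # The first row has no predecessor and gets 0, like LAG returning NULL.
    diffs = [0] + [
        1 if (cur[0] - prev[1]).days > 1 else 0
        for prev, cur in zip(rows, rows[1:])
    ]
    # cat: running total of diff, like SUM(diff) OVER (ORDER BY startdate).
    cats = list(accumulate(diffs))
    merged = []
    # GROUP BY cat, then MIN(startdate) and MAX(enddate) per group.
    for _, group in groupby(zip(cats, rows), key=lambda pair: pair[0]):
        members = [row for _, row in group]
        merged.append((min(s for s, _ in members), max(e for _, e in members)))
    return merged


# Sample rows from the question (hypothetical variable name).
sample = [
    (date(2014, 1, 1), date(2014, 1, 7)),
    (date(2014, 1, 8), date(2014, 1, 14)),
    (date(2014, 1, 15), date(2014, 1, 21)),
    (date(2014, 1, 28), date(2014, 2, 3)),
    (date(2014, 2, 4), date(2014, 2, 10)),
]
merged = merge_adjacent(sample)
```

For the question's five rows, `merged` holds two ranges: 2014-01-01 to 2014-01-21 and 2014-01-28 to 2014-02-10.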

You can do this without a recursive CTE. You just need to identify the records that begin each sequence and then use aggregation.

-- yourtable is a placeholder for your table name
with cte as (
      -- IsSeqStart = 1 when no other record ends the day before this one starts
      select t.*, (case when tprev.startdate is null then 1 else 0 end) as IsSeqStart
      from yourtable t left join
           yourtable tprev
           on t.startdate = dateadd(day, 1, tprev.enddate)
     )
select min(startdate) as startdate, max(enddate) as enddate
from (select cte.*, sum(IsSeqStart) over (order by startdate) as grp
      from cte
     ) t
group by grp

This solution allows for the possibility that the date ranges overlap rather than being strictly contiguous:

declare @t table (startDate date, endDate date)

insert into @t
values 
    ('01-01-2014', '01-07-2014'),
    ('01-08-2014', '01-14-2014'),
    ('01-15-2014', '01-21-2014'),
    ('01-28-2014', '02-03-2014'),
    ('02-04-2014', '02-10-2014')

;with cte 
as (select startdate, enddate from @t
    union all
    select cte.startdate, t.enddate
    from cte 
        inner join @t t
            on t.startdate between dateadd(dd,1,cte.startDate) and dateadd(dd,1,cte.enddate)
            and t.endDate > cte.endDate
    )
select min(startdate) startdate, enddate
from
    (select startDate, max(enddate) enddate 
    from cte 
    group by startdate) a
group by enddate
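For comparison, overlap-tolerant merging is often written as a single sorted sweep instead of a recursive join. The Python sketch below uses that different technique purely as an illustration; the helper name `merge_overlapping` and the `overlapping_sample` rows are assumptions, and its output may not match the query above in every edge case. A range is folded into the current island when it starts no later than one day after the island's end.

```python
from datetime import date, timedelta


def merge_overlapping(ranges):
    """Merge (start, end) ranges that overlap or are exactly one day apart."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + timedelta(days=1):
            # Overlaps or touches the current island: extend its end date.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # Gap of more than one day: start a new island.
            merged.append((start, end))
    return merged


# Hypothetical rows with an overlap between the first two ranges.
overlapping_sample = [
    (date(2014, 1, 1), date(2014, 1, 10)),
    (date(2014, 1, 5), date(2014, 1, 14)),
    (date(2014, 1, 15), date(2014, 1, 21)),
    (date(2014, 1, 28), date(2014, 2, 3)),
]
islands = merge_overlapping(overlapping_sample)
```

Sorting first means each range only has to be compared with the most recent island, so the sweep needs one pass after the sort.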

You can also try the following approach:

;WITH cte AS
(
  SELECT ROW_NUMBER() OVER(ORDER BY t1.startDate) AS Id, t1.StartDate
  FROM dbo.test104 t1 LEFT JOIN dbo.test104 t2 ON DATEADD(DAY, -1, t1.startDate) = t2.endDate
  WHERE t2.endDate IS NULL
), cte2 AS
(
  SELECT ROW_NUMBER() OVER(ORDER BY t1.EndDate) AS Id, t1.EndDate
  FROM dbo.test104 t1 LEFT JOIN dbo.test104 t2 ON DATEADD(DAY, 1, t1.EndDate) = t2.StartDate
  WHERE t2.startDate IS NULL
)
SELECT *
FROM cte c1 JOIN cte2 c2 ON c1.Id = c2.Id
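The pairing idea behind this query can be sketched procedurally as well. The Python snippet below is an illustration only; the helper name `pair_boundaries` and the `sample` rows (taken from the question) are assumptions. Like the query, it assumes that ranges link only when one starts exactly one day after another ends, and that ranges do not overlap.

```python
from datetime import date, timedelta


def pair_boundaries(ranges):
    """Pair each sequence start with the matching sequence end."""
    one_day = timedelta(days=1)
    all_starts = {s for s, _ in ranges}
    all_ends = {e for _, e in ranges}
    # Sequence starts: no range ends the day before (the first CTE).
    seq_starts = sorted(s for s, _ in ranges if s - one_day not in all_ends)
    # Sequence ends: no range starts the day after (the second CTE).
    seq_ends = sorted(e for _, e in ranges if e + one_day not in all_starts)
    # Joining on the row number amounts to pairing the two sorted lists.
    return list(zip(seq_starts, seq_ends))


# Sample rows from the question (hypothetical variable name).
sample = [
    (date(2014, 1, 1), date(2014, 1, 7)),
    (date(2014, 1, 8), date(2014, 1, 14)),
    (date(2014, 1, 15), date(2014, 1, 21)),
    (date(2014, 1, 28), date(2014, 2, 3)),
    (date(2014, 2, 4), date(2014, 2, 10)),
]
pairs = pair_boundaries(sample)
```

Because the n-th start and the n-th end belong to the same sequence only when ranges never overlap, this approach relies on that assumption holding for the data.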

