
SQL: Get an aggregate (SUM) of a calculation of two fields (DATEDIFF) that has conditional logic (CASE WHEN)

I have a dataset that includes a bunch of stay data (at a hotel). Each row contains a start date and an end date, but no duration field. I need to get a sum of the durations.

Sample Data:

| Stay ID | Client ID | Start Date | End Date   |
| 1       | 38        | 01/01/2018 | 01/31/2019 |
| 2       | 16        | 01/03/2019 | 01/07/2019 |
| 3       | 27        | 01/10/2019 | 01/12/2019 |
| 4       | 27        | 05/15/2019 | NULL       |
| 5       | 38        | 05/17/2019 | NULL       |

There are some added complications:

  1. I am using Crystal Reports and this is a SQL Expression, which obeys slightly different rules. Basically, it returns a single scalar value. Here is some more info: http://www.cogniza.com/wordpress/2005/11/07/crystal-reports-using-sql-expression-fields/
  2. Sometimes the end date field is blank (NULL) because the guest hasn't checked out yet. If it is blank, I would like to substitute the current timestamp.
  3. I only want to count nights that have occurred in the past year. If the start date of a given stay is more than a year ago, I need to adjust it.
  4. I need to get a sum by Client ID

I'm not actually any good at SQL so all I have is guesswork.

The proper syntax for a Crystal Reports SQL Expression is something like this:

(
SELECT (CASE
WHEN StayDateStart < DATEADD(year,-1,CURRENT_TIMESTAMP) THEN DATEDIFF(day,DATEADD(year,-1,CURRENT_TIMESTAMP),ISNULL(StayDateEnd,CURRENT_TIMESTAMP))
ELSE DATEDIFF(day,StayDateStart,ISNULL(StayDateEnd,CURRENT_TIMESTAMP))
END)
)

And that's giving me the correct value for a single row, if I wanted to do this:

| Stay ID | Client ID | Start Date | End Date   | Duration |
| 1       | 38        | 01/01/2018 | 01/31/2019 | 210      | // only days since June 4 2018 are counted
| 2       | 16        | 01/03/2019 | 01/07/2019 | 4        |
| 3       | 27        | 01/10/2019 | 01/12/2019 | 2        |
| 4       | 27        | 05/15/2019 | NULL       | 21       |
| 5       | 38        | 05/17/2019 | NULL       | 19       |

But I want to get the SUM of Duration per client, so I want this:

| Stay ID | Client ID | Start Date | End Date   | Duration |
| 1       | 38        | 01/01/2018 | 01/31/2019 | 229      | // 210+19
| 2       | 16        | 01/03/2019 | 01/07/2019 | 4        |
| 3       | 27        | 01/10/2019 | 01/12/2019 | 23       | // 2+21
| 4       | 27        | 05/15/2019 | NULL       | 23       |
| 5       | 38        | 05/17/2019 | NULL       | 229      |

I've tried to just wrap a SUM() around my CASE but that doesn't work:

(
SELECT SUM(CASE
WHEN StayDateStart < DATEADD(year,-1,CURRENT_TIMESTAMP) THEN DATEDIFF(day,DATEADD(year,-1,CURRENT_TIMESTAMP),ISNULL(StayDateEnd,CURRENT_TIMESTAMP))
ELSE DATEDIFF(day,StayDateStart,ISNULL(StayDateEnd,CURRENT_TIMESTAMP))
END)
)

It gives me an error saying that StayDateEnd is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. I don't know what that means, so I'm not sure how to troubleshoot or where to go from here. The next step after that is to get the SUM by Client ID.

Any help would be greatly appreciated!

You need a subquery that computes the sum grouped by client_id, and a join between your table and that subquery, e.g.:

select y.Stay_id, y.Client_id, y.StayDateStart, y.StayDateEnd, t.sum_duration
from your_table y
inner join (
    -- one row per client: total days that fall within the past year
    select Client_id,
        SUM(CASE
        WHEN StayDateStart < DATEADD(year,-1,CURRENT_TIMESTAMP) THEN DATEDIFF(day,DATEADD(year,-1,CURRENT_TIMESTAMP),ISNULL(StayDateEnd,CURRENT_TIMESTAMP))
        ELSE DATEDIFF(day,StayDateStart,ISNULL(StayDateEnd,CURRENT_TIMESTAMP))
        END) sum_duration
    from your_table
    group by Client_id
    ) t on t.Client_id = y.Client_id  -- columns are qualified to avoid an ambiguous Client_id
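
Since a Crystal Reports SQL Expression has to return a single scalar value, the join above may not paste in as-is. Here is a minimal sketch of the same per-client sum written as a correlated subquery instead. The table name Stays and the column ClientID are assumptions (the question never names them), and the quoted "Stays"."ClientID" reference assumes the report's main query exposes the table under that name. The bare SUM() attempt most likely failed because the aggregate ended up in the report's own SELECT list next to non-aggregated columns; giving the SUM its own FROM and WHERE sidesteps that.

```sql
(
  SELECT SUM(
           CASE
             -- stay began more than a year ago: count from one year ago instead
             WHEN s.StayDateStart < DATEADD(year, -1, CURRENT_TIMESTAMP)
               THEN DATEDIFF(day, DATEADD(year, -1, CURRENT_TIMESTAMP),
                             ISNULL(s.StayDateEnd, CURRENT_TIMESTAMP))
             -- otherwise count from the real start; open stays end "now"
             ELSE DATEDIFF(day, s.StayDateStart,
                           ISNULL(s.StayDateEnd, CURRENT_TIMESTAMP))
           END)
  FROM Stays s                            -- hypothetical table name
  WHERE s.ClientID = "Stays"."ClientID"   -- hypothetical column; ties the sum to the current report row
)
```

Every row for the same client would then show the same total, which is the shape of the desired output in the question.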

Although the explanation and data set are almost impossible to match, I think this is an approximation to what you want.

declare @your_data table (StayId int, ClientId int, StartDate date, EndDate date)
insert into @your_data values
(1,38,'2018-01-01','2019-01-31'),
(2,16,'2019-01-03','2019-01-07'),
(3,27,'2019-01-10','2019-01-12'),
(4,27,'2019-05-15',NULL),
(5,38,'2019-05-17',NULL)

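;with data as (
  select *,
    -- days counted per stay: the start is clamped to one year ago,
    -- and a NULL end date is treated as today
    datediff(day,
      case 
        when datediff(day,StartDate,getdate())>365 then dateadd(year,-1,getdate())
        else StartDate
      end,
      isnull(EndDate,getdate())
    ) days
  from @your_data
)
select *,
  -- per-client total repeated on every row of that client
  sum(days) over (partition by ClientId) as total_days
from data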

https://rextester.com/HCKOR53440
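
One edge case that neither query above handles: a stay that ended more than a year ago produces a negative day count, because the clamped start falls after the end date. A rough sketch of one way to guard against that follows. The fixed @now value, the extra row 6, and the names @stays, clamped, EffectiveStart, EffectiveEnd and NightsInWindow are all assumptions added for illustration; in a real report CURRENT_TIMESTAMP would replace @now.

```sql
-- Sketch only: @now is fixed so the result is reproducible.
DECLARE @now date = '2019-06-05';
DECLARE @window_start date = DATEADD(year, -1, @now);

DECLARE @stays TABLE (StayId int, ClientId int, StartDate date, EndDate date);
INSERT INTO @stays VALUES
(1, 38, '2018-01-01', '2019-01-31'),
(2, 16, '2019-01-03', '2019-01-07'),
(3, 27, '2019-01-10', '2019-01-12'),
(4, 27, '2019-05-15', NULL),
(5, 38, '2019-05-17', NULL),
(6, 16, '2017-03-01', '2017-03-10');  -- hypothetical row: ended before the one-year window

WITH clamped AS (
  SELECT ClientId,
         -- never count days before the start of the window
         CASE WHEN StartDate < @window_start THEN @window_start ELSE StartDate END AS EffectiveStart,
         -- open stays are treated as ending at @now
         ISNULL(EndDate, @now) AS EffectiveEnd
  FROM @stays
)
SELECT ClientId,
       -- stays that lie entirely outside the window contribute 0 instead of a negative number
       SUM(CASE WHEN EffectiveEnd > EffectiveStart
                THEN DATEDIFF(day, EffectiveStart, EffectiveEnd)
                ELSE 0 END) AS NightsInWindow
FROM clamped
GROUP BY ClientId
ORDER BY ClientId;
-- With @now = '2019-06-05': client 16 -> 4, client 27 -> 23, client 38 -> 259
```

With this fixed reference date, stay 1 contributes 240 days (2018-06-05 to 2019-01-31) rather than the 210 shown in the question, so the question's sample figures appear to be approximate.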


 