
How to Select MIN and MAX datetimes within a 24 hour period in T-SQL

The records exist in this format:

+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| StartDTM                    | EndDTM                      | PersonID | PersonName | Duration | TimeSheetItemID |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-17 17:48:00.0000000 | 2019-08-17 18:00:00.0000000 | 111111   | Smith, Bob | 0.200000 | 154446149       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-17 18:00:00.0000000 | 2019-08-17 23:00:00.0000000 | 111111   | Smith, Bob | 5.000000 | 154446149       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-17 23:00:00.0000000 | 2019-08-17 23:30:00.0000000 | 111111   | Smith, Bob | 0.500000 | 154446149       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-17 23:30:00.0000000 | 2019-08-18 00:00:00.0000000 | 111111   | Smith, Bob | 0.500000 | 154446149       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-18 00:00:00.0000000 | 2019-08-18 02:14:00.0000000 | 111111   | Smith, Bob | 2.233333 | 154446149       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-18 02:14:00.0000000 | 2019-08-18 06:18:00.0000000 | 111111   | Smith, Bob | 4.066666 | 154478804       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-25 17:48:00.0000000 | 2019-08-25 18:00:00.0000000 | 111111   | Smith, Bob | 0.200000 | 154745867       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-25 18:00:00.0000000 | 2019-08-25 23:00:00.0000000 | 111111   | Smith, Bob | 5.000000 | 154745867       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-25 23:00:00.0000000 | 2019-08-25 23:30:00.0000000 | 111111   | Smith, Bob | 0.500000 | 154745867       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-25 23:30:00.0000000 | 2019-08-26 00:00:00.0000000 | 111111   | Smith, Bob | 0.500000 | 154745867       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-26 00:00:00.0000000 | 2019-08-26 02:00:00.0000000 | 111111   | Smith, Bob | 2.000000 | 154745867       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+
| 2019-08-26 02:00:00.0000000 | 2019-08-26 05:54:00.0000000 | 111111   | Smith, Bob | 3.900000 | 154756492       |
+-----------------------------+-----------------------------+----------+------------+----------+-----------------+

I need to select the MIN StartDTM and the MAX EndDTM within a 24 hour period. I have tried selecting MIN(StartDTM) and MAX(EndDTM) in combination with GROUP BY PersonName and TimeSheetItemID, but this fails because sometimes more than one TimeSheetItemID exists within a 24 hour period (see row 6 above).

My desired results should look like this:

+-----------------------------+-----------------------------+----------+------------+-----------------+
| StartDTM                    | EndDTM                      | PersonID | PersonName | TimeSheetItemID |
+-----------------------------+-----------------------------+----------+------------+-----------------+
| 2019-08-17 17:48:00.0000000 | 2019-08-18 06:18:00.0000000 | 111111   | Smith, Bob | 154446149       |
+-----------------------------+-----------------------------+----------+------------+-----------------+
| 2019-08-25 17:48:00.0000000 | 2019-08-26 05:54:00.0000000 | 111111   | Smith, Bob | 154745867       |
+-----------------------------+-----------------------------+----------+------------+-----------------+

Is this possible to achieve in T-SQL?

This is a gaps-and-islands problem. You need to find where the islands start. In this case, I recommend a cumulative maximum.

select personId, min(StartDTM) as StartDTM, max(EndDTM) as EndDTM
from (select t.*,
             sum(case when prev_maxEndDTM >= dateadd(day, -1, StartDTM)
                      then 0  -- previous maximum end is within a day of this start, so no new island
                      else 1  -- previous maximum end is earlier (or missing), so new island
                  end) over (partition by personId order by StartDTM) as grp
      from (select t.*,
                   max(EndDTM) over (partition by personId
                                     order by StartDTM
                                     rows between unbounded preceding and 1 preceding
                                    ) as prev_maxEndDTM
            from t
           ) t
     ) t
group by personId, grp;  -- one output row per island, not per person
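
To try this against the sample rows from the question, a minimal setup could look like the sketch below. The table variable `@t` and its trimmed column list are assumptions made for illustration, and picking the smallest `TimeSheetItemID` per island is just one way to match the desired output; the island detection uses the same cumulative-maximum idea.

```sql
-- Hypothetical table variable holding the question's sample rows.
declare @t table (
    StartDTM        datetime2(7) not null,
    EndDTM          datetime2(7) not null,
    PersonID        int          not null,
    TimeSheetItemID int          not null
);

insert into @t (StartDTM, EndDTM, PersonID, TimeSheetItemID) values
    ('2019-08-17 17:48:00', '2019-08-17 18:00:00', 111111, 154446149),
    ('2019-08-17 18:00:00', '2019-08-17 23:00:00', 111111, 154446149),
    ('2019-08-17 23:00:00', '2019-08-17 23:30:00', 111111, 154446149),
    ('2019-08-17 23:30:00', '2019-08-18 00:00:00', 111111, 154446149),
    ('2019-08-18 00:00:00', '2019-08-18 02:14:00', 111111, 154446149),
    ('2019-08-18 02:14:00', '2019-08-18 06:18:00', 111111, 154478804),
    ('2019-08-25 17:48:00', '2019-08-25 18:00:00', 111111, 154745867),
    ('2019-08-25 18:00:00', '2019-08-25 23:00:00', 111111, 154745867),
    ('2019-08-25 23:00:00', '2019-08-25 23:30:00', 111111, 154745867),
    ('2019-08-25 23:30:00', '2019-08-26 00:00:00', 111111, 154745867),
    ('2019-08-26 00:00:00', '2019-08-26 02:00:00', 111111, 154745867),
    ('2019-08-26 02:00:00', '2019-08-26 05:54:00', 111111, 154756492);

with running as (
    -- Latest EndDTM seen on any earlier row for the same person.
    select t.*,
           max(EndDTM) over (partition by PersonID
                             order by StartDTM
                             rows between unbounded preceding and 1 preceding
                            ) as prev_maxEndDTM
    from @t t
), flagged as (
    -- Running count of island starts; rows in the same island share a grp value.
    select r.*,
           sum(case when prev_maxEndDTM >= dateadd(day, -1, StartDTM)
                    then 0
                    else 1
               end) over (partition by PersonID order by StartDTM) as grp
    from running r
)
select min(StartDTM)        as StartDTM,
       max(EndDTM)          as EndDTM,
       PersonID,
       min(TimeSheetItemID) as TimeSheetItemID
from flagged
group by PersonID, grp
order by PersonID, min(StartDTM);
```

With these rows the week-long gap between 2019-08-18 06:18 and 2019-08-25 17:48 is the only place a new island starts after the first row, so the query should return two rows, one per shift. Note that `dateadd(day, -1, ...)` merges any rows separated by less than a day; a tighter tolerance would need a smaller interval there.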

If I understand correctly that you want the min/max per day, then you also need to group by the day. (Untested.)

select StartDTM, EndDTM, startTable.PersonID
   ,startTable.PersonName, startTable.TimeSheetItemID
from (
   select min(StartDTM) StartDTM, PersonID, PersonName, TimeSheetItemID
   from YourTable
   group by convert(date,StartDTM), PersonID, TimeSheetItemID, PersonName
) startTable
full outer join (
   select max(EndDTM) EndDTM, PersonID, TimeSheetItemID
   from YourTable
   group by convert(date,EndDTM), PersonID, TimeSheetItemID
) endTable
   on startTable.PersonID = endTable.PersonID
   and startTable.TimeSheetItemID = endTable.TimeSheetItemID
   -- the date match goes in the join; in a where clause it would discard the unmatched (null) rows
   and convert(date,StartDTM) = convert(date,EndDTM)
order by startTable.PersonID,StartDTM

Timesheets that do not have both a StartDTM and an EndDTM on a given day should have null values in this query.

If you are interested in any 24 hour period, then that's a whole different thing.

You can use lag as below to solve this problem:

;with cte_bucket as (
    select *, sum(difsec) over(partition by personid order by startdtm) bucket from (
        select *, coalesce(ABS(datediff(ss, startdtm, lag(enddtm) over( partition by personid order by startdtm))), 1) difsec
        from #table
    ) a 
)
select min(startdtm), max(enddtm), personid, personname, min(timesheetitemid)  from cte_bucket
group by personid, personname, bucket 

Code for reference:

https://rextester.com/UNF89433

+----+---------------------+---------------------+----------+------------+-----------------+
|    |      startDTM       |       endDTM        | personid | personname | TimesheetItemId |
+----+---------------------+---------------------+----------+------------+-----------------+
|  1 | 17.08.2019 17:48:00 | 18.08.2019 06:18:00 |   111111 | Smith, Bob |       154446149 |
|  2 | 25.08.2019 17:48:00 | 26.08.2019 05:54:00 |   111111 | Smith, Bob |       154745867 |
+----+---------------------+---------------------+----------+------------+-----------------+ 
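
The bucket in the lag-based query works because the difference in seconds is 0 for back-to-back rows and nonzero wherever a gap occurs, so the running sum changes only at a gap. A variant that makes the island flag explicit, and allows a configurable gap tolerance, might look like the sketch below. The temp table `#shifts`, the variable `@maxGapSeconds`, and the column `is_new_island` are hypothetical names; the four rows are taken from the question's sample data.

```sql
-- Hypothetical temp table with four of the question's rows.
create table #shifts (
    startdtm        datetime2(7),
    enddtm          datetime2(7),
    personid        int,
    personname      varchar(50),
    timesheetitemid int
);

insert into #shifts (startdtm, enddtm, personid, personname, timesheetitemid) values
    ('2019-08-18 00:00:00', '2019-08-18 02:14:00', 111111, 'Smith, Bob', 154446149),
    ('2019-08-18 02:14:00', '2019-08-18 06:18:00', 111111, 'Smith, Bob', 154478804),
    ('2019-08-26 00:00:00', '2019-08-26 02:00:00', 111111, 'Smith, Bob', 154745867),
    ('2019-08-26 02:00:00', '2019-08-26 05:54:00', 111111, 'Smith, Bob', 154756492);

-- Assumed tolerance: rows whose gap to the previous row is at most this many seconds are merged.
declare @maxGapSeconds int = 0;

with cte_flag as (
    select *,
           -- 1 marks the first row of an island; the first row per person has no lag value and falls to the else branch.
           case when datediff(second,
                              lag(enddtm) over (partition by personid order by startdtm),
                              startdtm) <= @maxGapSeconds
                then 0
                else 1
           end as is_new_island
    from #shifts
), cte_bucket as (
    select *,
           sum(is_new_island) over (partition by personid order by startdtm) as bucket
    from cte_flag
)
select min(startdtm)        as startdtm,
       max(enddtm)          as enddtm,
       personid,
       personname,
       min(timesheetitemid) as timesheetitemid
from cte_bucket
group by personid, personname, bucket
order by personid, min(startdtm);

drop table #shifts;
```

Because `lag` only looks at the immediately preceding row, this variant assumes rows for a person do not nest inside an earlier, longer row; where that can happen, a cumulative maximum of the end time is the safer comparison.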

