简体   繁体   English

在查询中查找重叠的时间戳

[英]Finding overlapping timestamps in query

I have an issue with removing or tagging overlapping timestamps grouped by certain ID.我在删除或标记按特定 ID 分组的重叠时间戳时遇到问题。

Times can overlap in nest and may have same start time or end time.时间可以在嵌套中重叠,并且可以具有相同的开始时间或结束时间。
If second time starts before previous time has ended it will end before or at the same time as previous time.如果第二次在上一次结束之前开始,它将在上一次之前或与上一次同时结束。 No time differences will go over 12 hours. go 不会有超过 12 小时的时差。

Using T-SQL.使用 T-SQL。

Sample data:样本数据:

ID  task_id starttime                       endtime
11  1       2023-01-10 06:31:00.000         2023-01-10 08:53:00.000
11  1       2023-01-10 08:00:00.000         2023-01-10 08:53:00.000
11  2       2023-01-10 13:14:00.000         2023-01-10 15:15:00.000
11  2       2023-01-10 15:46:00.000         2023-01-10 17:59:00.000
11  2       2023-01-10 18:49:00.000         2023-01-10 18:50:00.000
12  3       2023-01-09 10:10:00.000         2023-01-09 11:10:00.000
12  3       2023-01-09 10:10:00.000         2023-01-09 10:50:00.000
13  4       2023-01-08 20:00:00.000         2023-01-09 03:44:00.000
13  4       2023-01-08 21:00:00.000         2023-01-09 02:00:00.000
14  5       2023-01-01 19:23:00.000         2023-01-01 20:47:00.000
14  5       2023-01-02 03:35:00.000         2023-01-02 06:57:00.000

Desired result:期望的结果:

ID  task_id starttime                       endtime
11  1       2023-01-10 06:31:00.000         2023-01-10 08:53:00.000
11  2       2023-01-10 13:14:00.000         2023-01-10 15:15:00.000
11  2       2023-01-10 15:46:00.000         2023-01-10 17:59:00.000
11  2       2023-01-10 18:49:00.000         2023-01-10 18:50:00.000
12  3       2023-01-09 10:10:00.000         2023-01-09 11:10:00.000
13  4       2023-01-08 20:00:00.000         2023-01-09 03:44:00.000
14  5       2023-01-01 19:23:00.000         2023-01-01 20:47:00.000
14  5       2023-01-02 03:35:00.000         2023-01-02 06:57:00.000

I've tried methods with lead or lag functions but it doesn't seem to play well with edge cases.我尝试过具有领先或滞后功能的方法,但它似乎不能很好地处理边缘情况。 For example:例如:

case when lead(starttime) over (partition by task_id order by starttime) <> endtime then 1 else 0 end as overlap_tag

Doesn't count the time in ID 11 task_id 2 from 18:49-18:50 as not overlapping and doesn't seem to take into account the day changing.不将 ID 11 task_id 2 中的时间从 18:49-18:50 计算为不重叠,并且似乎没有考虑到日期的变化。

I only tested it on PostgreSQL, but it might help.我只在 PostgreSQL 上测试过,但它可能会有所帮助。


Conditions情况

Preparation准备

CREATE TABLE task_duration (
    id INTEGER, 
    task_id INTEGER,
    start_time TIMESTAMP, 
    end_time TIMESTAMP
);

INSERT INTO task_duration VALUES (11, 1, '2023-01-10 06:31:00.000', '2023-01-10 08:53:00.000');
INSERT INTO task_duration VALUES (11, 1, '2023-01-10 08:00:00.000', '2023-01-10 08:53:00.000');
INSERT INTO task_duration VALUES (11, 2, '2023-01-10 13:14:00.000', '2023-01-10 15:15:00.000');
INSERT INTO task_duration VALUES (11, 2, '2023-01-10 15:46:00.000', '2023-01-10 17:59:00.000');
INSERT INTO task_duration VALUES (11, 2, '2023-01-10 18:49:00.000', '2023-01-10 18:50:00.000');
INSERT INTO task_duration VALUES (12, 3, '2023-01-09 10:10:00.000', '2023-01-09 11:10:00.000');
INSERT INTO task_duration VALUES (12, 3, '2023-01-09 10:10:00.000', '2023-01-09 10:50:00.000');
INSERT INTO task_duration VALUES (13, 4, '2023-01-08 20:00:00.000', '2023-01-09 03:44:00.000');
INSERT INTO task_duration VALUES (13, 4, '2023-01-08 21:00:00.000', '2023-01-09 02:00:00.000');
INSERT INTO task_duration VALUES (14, 5, '2023-01-01 19:23:00.000', '2023-01-01 20:47:00.000');
INSERT INTO task_duration VALUES (14, 5, '2023-01-02 03:35:00.000', '2023-01-02 06:57:00.000');

Query询问

SELECT id, 
    task_id, 
    start_time, 
    end_time
FROM (
    SELECT id, 
        task_id, 
        start_time, 
        end_time, 
        LAG(start_time) OVER (PARTITION BY task_id ORDER BY task_id, start_time, end_time DESC) AS prev_start_time, 
        LAG(end_time) OVER (PARTITION BY task_id ORDER BY task_id, start_time, end_time DESC) AS prev_end_time
    FROM task_duration
) v
WHERE prev_start_time IS NULL  -- 1st condition
    OR NOT (v.end_time >= v.prev_start_time AND v.start_time <= v.prev_end_time);  -- 2nd condition

Result结果

id|task_id|start_time             |end_time               |
--+-------+-----------------------+-----------------------+
11|      1|2023-01-10 06:31:00.000|2023-01-10 08:53:00.000|
11|      2|2023-01-10 13:14:00.000|2023-01-10 15:15:00.000|
11|      2|2023-01-10 15:46:00.000|2023-01-10 17:59:00.000|
11|      2|2023-01-10 18:49:00.000|2023-01-10 18:50:00.000|
12|      3|2023-01-09 10:10:00.000|2023-01-09 11:10:00.000|
13|      4|2023-01-08 20:00:00.000|2023-01-09 03:44:00.000|
14|      5|2023-01-01 19:23:00.000|2023-01-01 20:47:00.000|
14|      5|2023-01-02 03:35:00.000|2023-01-02 06:57:00.000|

Try this https://dbfiddle.uk/id_waBN _ By using recursive query you make sure you cover any number of overlapping intervals.试试这个https://dbfiddle.uk/id_waBN _ 通过使用递归查询,您可以确保覆盖任意数量的重叠间隔。 You start by the rows not intersected by previous one.您从与前一行不相交的行开始。

with task_duration_wrn(id, task_id, starttime, endtime, rn, act) as (
    select id, task_id, starttime, endtime, 
        rank() over(partition by id, task_id order by starttime, endtime) as rn,
        cast( 
        case when 
         starttime <= lag(endtime) over(partition by id, task_id order by starttime, endtime) 
        then 'PACK' end as VARCHAR(4))
    from task_duration
),
cte(id, task_id, starttime, endtime, rn, lvl, act) as (
    select d.id, d.task_id, d.starttime, d.endtime, d.rn, 1, 
    CAST(NULL AS VARCHAR(4))
    from task_duration_wrn d
    where act is NULL
    
    union all
    
    select d.id, d.task_id, c.starttime, d.endtime, d.rn, c.lvl+1, d.act
    from cte c
    join task_duration_wrn d on c.id = d.id and c.task_id = d.task_id and 
        c.lvl+1 = d.rn
    where d.act = 'PACK'    
)
select id, task_id, starttime, max(endtime) as endtime
from cte c
group by id, task_id, starttime
order by id, task_id, starttime

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM