
Merge every 2 consecutive records into 1

I have a pre-processed table in which I want to merge every pair of consecutive rows into a single record containing fields from both rows.

|-------------------|-----|----|
|Timestamp          |Event|User|
|-------------------|-----|----|
|17/03/2020 03:22:00|Start|1   |
|17/03/2020 03:22:05|End  |1   |
|17/03/2020 03:22:10|Start|2   |
|17/03/2020 03:22:15|End  |2   |
|17/03/2020 03:23:00|Start|1   |
|17/03/2020 03:23:22|End  |1   |
|-------------------|-----|----|

The query should return:

|-------------------|-------------------|----|
|StartTimestamp     |EndTimestamp       |User|
|-------------------|-------------------|----|
|17/03/2020 03:22:00|17/03/2020 03:22:05|1   |
|17/03/2020 03:22:10|17/03/2020 03:22:15|2   |
|17/03/2020 03:23:00|17/03/2020 03:23:22|1   |
|-------------------|-------------------|----|

You can safely assume that every 2 consecutive records form a correct pair (the events are Start and End respectively, and User is the same), since the table is pre-filtered.

EDIT: Sorry, I forgot to mention that having multiple pairs for a single user is allowed. I've adjusted the example table above to show that.

As suggested, this should do what you want:

SELECT
     MIN(Timestamp) AS StartTimestamp,
     MAX(Timestamp) AS EndTimestamp,
     User
FROM 
     mytable
GROUP BY User;

EDIT: Since a user ID can appear multiple times, in multiple pairs, see the following query:

WITH cte AS (
     SELECT mt.*, ROW_NUMBER() OVER(ORDER BY time) AS rn FROM mytable mt
)
SELECT 
     t1.userid,
     t1.time AS StartTimestamp, 
     t2.time AS EndTimestamp
FROM cte t1
JOIN cte t2 ON t1.rn+1 = t2.rn
WHERE t1.event = 'Start'

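To try the ROW_NUMBER() self-join without a database server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite (3.25 or newer, for window functions) is only a stand-in for whatever engine the asker uses, and the ISO-formatted timestamps are an assumption made so that text ordering matches chronological order. The column names time, event and userid follow the query above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (time TEXT, event TEXT, userid INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("2020-03-17 03:22:00", "Start", 1),
        ("2020-03-17 03:22:05", "End", 1),
        ("2020-03-17 03:22:10", "Start", 2),
        ("2020-03-17 03:22:15", "End", 2),
        ("2020-03-17 03:23:00", "Start", 1),
        ("2020-03-17 03:23:22", "End", 1),
    ],
)

# Number all rows by time, then join each Start row to the row right after it.
query = """
WITH cte AS (
    SELECT mt.*, ROW_NUMBER() OVER (ORDER BY time) AS rn FROM mytable mt
)
SELECT t1.userid, t1.time AS StartTimestamp, t2.time AS EndTimestamp
FROM cte t1
JOIN cte t2 ON t1.rn + 1 = t2.rn
WHERE t1.event = 'Start'
ORDER BY t1.rn
"""
pairs = conn.execute(query).fetchall()
for userid, start, end in pairs:
    print(userid, start, end)
```

This relies on the question's guarantee that every two consecutive rows form a pair; the join only looks at adjacency in time order and never compares user IDs.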

You can use row_number() and do conditional aggregation:

select user, 
       min(case when event = 'Start' then timestamp end) as starttimestamp,
       min(case when event = 'End' then timestamp end) as endtimestamp
from (select t.*, 
             row_number() over (partition by user, event order by timestamp) as seq
      from mytable t
     ) t
group by user, seq;
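The following sketch shows what the seq column looks like before the aggregation collapses each pair. It is a hedged illustration run on SQLite through Python, not the asker's actual setup: the table name mytable, the ISO timestamp format, and the unquoted column name user (which SQLite accepts, but some engines reserve) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE mytable (timestamp TEXT, event TEXT, user INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("2020-03-17 03:22:00", "Start", 1),
        ("2020-03-17 03:22:05", "End", 1),
        ("2020-03-17 03:22:10", "Start", 2),
        ("2020-03-17 03:22:15", "End", 2),
        ("2020-03-17 03:23:00", "Start", 1),
        ("2020-03-17 03:23:22", "End", 1),
    ],
)

# Step 1: number the rows separately per (user, event).
numbered_sql = """
SELECT t.user, t.event, t.timestamp,
       ROW_NUMBER() OVER (PARTITION BY t.user, t.event ORDER BY t.timestamp) AS seq
FROM mytable t
"""
numbered = conn.execute(numbered_sql + " ORDER BY timestamp").fetchall()

# Step 2: the n-th Start and the n-th End of a user share the same seq,
# so grouping by (user, seq) folds each pair into one row.
pairs = conn.execute(
    """
    SELECT user,
           MIN(CASE WHEN event = 'Start' THEN timestamp END) AS starttimestamp,
           MIN(CASE WHEN event = 'End' THEN timestamp END) AS endtimestamp
    FROM (""" + numbered_sql + """) t
    GROUP BY user, seq
    ORDER BY starttimestamp
    """
).fetchall()

for row in numbered:
    print(row)
for row in pairs:
    print(row)
```

In time order, the seq values come out as 1, 1, 1, 1, 2, 2: user 1's second Start and End both receive seq 2, which is what keeps the two pairs of user 1 apart.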

I would suggest using lead() or a cumulative min():

select t.*
from (select t.*,
             min(case when event = 'End' then timestamp end) over (partition by user order by timestamp desc) as end_time
      from t
     ) t
where event = 'Start';
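The answer shows only the cumulative min() form. A possible lead() variant, sketched here on SQLite through Python, could look like the following. The table name t matches the query above; the ISO timestamps, the alias x, and the use of SQLite itself are assumptions made for the sake of a runnable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE t (timestamp TEXT, event TEXT, user INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2020-03-17 03:22:00", "Start", 1),
        ("2020-03-17 03:22:05", "End", 1),
        ("2020-03-17 03:22:10", "Start", 2),
        ("2020-03-17 03:22:15", "End", 2),
        ("2020-03-17 03:23:00", "Start", 1),
        ("2020-03-17 03:23:22", "End", 1),
    ],
)

# For every row, LEAD() fetches the timestamp of the same user's next row.
# Keeping only the Start rows leaves one record per Start/End pair.
query = """
SELECT x.user, x.timestamp AS start_time, x.end_time
FROM (
    SELECT t.*,
           LEAD(t.timestamp) OVER (PARTITION BY t.user ORDER BY t.timestamp) AS end_time
    FROM t
) x
WHERE x.event = 'Start'
ORDER BY start_time
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

Unlike the cumulative min(), this version does not check that the next row is actually an End event; it depends on the question's statement that each user's rows strictly alternate between Start and End.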

Number the rows per user and event to get event numbers. Then join the event starts with the event ends.

with s as
(
  select
    [user], timestamp,
    row_number() over (partition by [user] order by timestamp) as event_number
  from mytable
  where event = 'Start'
)
, e as
(
  select
    [user], timestamp,
    row_number() over (partition by [user] order by timestamp) as event_number
  from mytable
  where event = 'End'
)
select s.[user], s.timestamp as start_time, e.timestamp as end_time
from s
join e on e.[user] = s.[user] and e.event_number = s.event_number
order by start_time;

Use a left outer join if you want to show events that have started but not ended yet.

This query also allows for parallel events (i.e. a user starts an event, then another user starts an event before the first user ends theirs).

What the query doesn't account for is missing events, e.g. a user starts an event, but its end is never recorded in the table. If the user then starts a new event and ends it, my query will pair the second event's end with the first event's start.
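The missing-event caveat can be reproduced with a small sketch. It runs the numbering query in its left-outer-join form on SQLite through Python, against a hypothetical table in which user 1's first End row was never recorded. The data, the ISO timestamps, and the unbracketed column name user are assumptions chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE mytable (timestamp TEXT, event TEXT, user INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("2020-03-17 03:22:00", "Start", 1),  # the End of this event is missing
        ("2020-03-17 03:23:00", "Start", 1),
        ("2020-03-17 03:23:22", "End", 1),
    ],
)

# Same idea as the query above, with LEFT JOIN so unmatched starts stay visible.
query = """
WITH s AS (
    SELECT user, timestamp,
           ROW_NUMBER() OVER (PARTITION BY user ORDER BY timestamp) AS event_number
    FROM mytable
    WHERE event = 'Start'
),
e AS (
    SELECT user, timestamp,
           ROW_NUMBER() OVER (PARTITION BY user ORDER BY timestamp) AS event_number
    FROM mytable
    WHERE event = 'End'
)
SELECT s.user, s.timestamp AS start_time, e.timestamp AS end_time
FROM s
LEFT JOIN e ON e.user = s.user AND e.event_number = s.event_number
ORDER BY start_time
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The first start (03:22:00) is paired with the only recorded end (03:23:22), although that end belongs to the second event, and the second start is reported with no end at all. With pre-filtered data as described in the question this situation should not occur.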
