
Count occurrences of row couples

I have a table that is event driven, i.e. it is updated whenever an event occurs. When a 'start' event comes in it records the person's location; when an 'end' event comes in it does not.

I want to count the number of ends, but report each one's location, which is recorded in the corresponding 'start' event.

Note: there are other types of events, which I want to ignore.

Table

DROP TABLE Events;
CREATE TABLE Events (
    EventName       VARCHAR(10) NOT NULL, 
    EventPersonName VARCHAR(50) NOT NULL, 
    EventPersonLocation VARCHAR(50) NULL, 
    EventDate       DATETIME2(0) NULL
);

-- 'end' events carry no location, so they store a real NULL (not the string 'null')
INSERT  Events 
SELECT 'end', 'bob', NULL, '2014-05-27 08:00' UNION ALL 
SELECT 'end', 'sally', NULL, '2014-05-27 07:00' UNION ALL 
SELECT 'Start', 'sally', 'Sydney', '2014-05-27 06:30' UNION ALL
SELECT 'start', 'bob', 'Belfast', '2014-05-27 06:00' UNION ALL 
SELECT 'end', 'sally', NULL, '2014-05-27 05:00' UNION ALL 
SELECT 'start', 'jack', 'London', '2014-05-27 04:00' UNION ALL 
SELECT 'end', 'john', NULL, '2014-05-27 03:00' UNION ALL 
SELECT 'start', 'sally', 'New York', '2014-05-27 02:00' UNION ALL 
SELECT 'start', 'john', 'Dublin', '2014-05-27 01:00';

How can I find which events completed since 2014/05/27 00:30? The result should be:

John, Dublin
Sally, New York
Sally, Sydney
Bob, Belfast

I suspect I have to join the table to itself, which will give me one row for each start/end pair, and then I can simply take the details I need. But what about starts with no ends, and ends with no starts (due to the time filter)?

This query gives you the results you want:

SELECT
  s.eventPersonName,
  s.eventPersonLocation,
  s.eventDate AS startDate,
  e.eventDate AS endDate
FROM events e
JOIN events s ON
  s.eventPersonName=e.eventPersonName AND
  s.eventName      ='start'           AND
  s.eventDate = (
    SELECT MAX(p.eventDate)
    FROM events p
    WHERE
      p.eventPersonName=e.eventPersonName AND
      p.eventDate<e.eventDate)
WHERE e.eventName='end';

I have tested it on SQLFiddle.
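The question also asks for completions since 2014/05/27 00:30, which the query above does not filter on. A minimal sketch of one way to add that, assuming the cutoff applies to the date of the 'end' event; `@Since` is a hypothetical variable name, not part of the original answer:

```sql
-- @Since is a hypothetical cutoff variable introduced for illustration.
DECLARE @Since DATETIME2(0) = '2014-05-27 00:30';

SELECT
  s.EventPersonName,
  s.EventPersonLocation,
  s.EventDate AS StartDate,
  e.EventDate AS EndDate
FROM Events e
JOIN Events s ON
  s.EventPersonName = e.EventPersonName AND
  s.EventName       = 'start'           AND
  s.EventDate = (
    -- latest event of any type for this person before the end
    SELECT MAX(p.EventDate)
    FROM Events p
    WHERE
      p.EventPersonName = e.EventPersonName AND
      p.EventDate < e.EventDate)
WHERE e.EventName = 'end'
  AND e.EventDate >= @Since;  -- assumption: "completed since" refers to the end time
```

With the sample data every 'end' falls after the cutoff, so the same rows should come back. An 'end' inside the window can still be paired with a 'start' from before the cutoff; if that is not wanted, the same condition could be applied to `s.EventDate` as well.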

Considerations:

This query considers only events that follow the expected start-end sequence. So if you have partial data for some person (like start-end-end-start), it will ignore ends immediately preceded by ends and starts immediately followed by starts. It can be made to behave differently, but this seems to me like a good enough approach.

This query can do some strange things if you have events for the same person with the same datetime. It contains a JOIN on MAX(eventDate) and this can produce multiple rows in such a case.
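As an alternative that sidesteps the join on MAX(eventDate), the same pairing can be sketched with LAG. This is not the answer's query, only an illustration, and it assumes SQL Server 2012 or later (where LAG is available) and the Events table from the question. It also discards other event types before pairing, which the question asks for:

```sql
-- Sketch only: pair each 'end' with the event immediately before it
-- for the same person, after discarding other event types.
WITH Ordered AS (
    SELECT  EventName,
            EventPersonName,
            EventDate,
            LAG(EventName)           OVER (PARTITION BY EventPersonName ORDER BY EventDate) AS PrevEventName,
            LAG(EventPersonLocation) OVER (PARTITION BY EventPersonName ORDER BY EventDate) AS PrevLocation,
            LAG(EventDate)           OVER (PARTITION BY EventPersonName ORDER BY EventDate) AS PrevEventDate
    FROM    Events
    WHERE   EventName IN ('start', 'end')   -- ignore other event types
)
SELECT  EventPersonName,
        PrevLocation  AS EventPersonLocation,
        PrevEventDate AS StartDate,
        EventDate     AS EndDate
FROM    Ordered
WHERE   EventName = 'end'
  AND   PrevEventName = 'start';   -- keep only ends directly preceded by a start
```

Each 'end' row yields at most one output row, so identical datetimes cannot multiply rows here. Which of two tied rows counts as "previous" is still arbitrary, though, unless a tiebreaker column is added to the ORDER BY.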

Try this (it shows only those persons who have finished their events: SUM of events = 0):

Updated solution:

DECLARE @Events TABLE (
    EventName       VARCHAR(10) NOT NULL, 
    EventPersonName VARCHAR(50) NOT NULL, 
    EventPersonLocation VARCHAR(50) NULL, 
    EventDate       DATETIME2(0) NULL
);
INSERT  @Events
SELECT 'end', 'bob', null, '2014-05-27 08:00' UNION ALL 
SELECT 'end', 'sally', null, '2014-05-27 07:00' UNION ALL 
SELECT 'Start', 'sally', 'Sydney', '2014-05-27 06:30' UNION ALL

SELECT 'start', 'bob', 'Belfast', '2014-05-27 06:00' UNION ALL 
SELECT 'end', 'sally', null, '2014-05-27 05:00' UNION ALL 
SELECT 'start', 'jack', 'London', '2014-05-27 04:00' UNION ALL 
SELECT 'end', 'john', null, '2014-05-27 03:00' UNION ALL 
SELECT 'start', 'sally', 'New Yourk', '2014-05-27 02:00' UNION ALL 
SELECT 'start', 'john', 'Dublin', '2014-05-27 01:00';


SELECT  y.EventPersonName, 
        y.EventNum,
        MIN(y.EventDate) AS StartDate,
        MAX(y.EventDate) AS EndDate,
        MAX(y.EventPersonLocation) AS EventPersonLocation
FROM
(
    SELECT  x.EventPersonName,
            x.EventDate,
            x.EventPersonLocation,
            SUM(CASE WHEN x.EventName = 'start' THEN +1 WHEN x.EventName = 'end' THEN -1 ELSE 1/0 END) OVER(PARTITION BY x.EventPersonName) AS SumOfEvents,
            (ROW_NUMBER() OVER(PARTITION BY x.EventPersonName ORDER BY x.EventDate ASC) + 1) / 2 AS EventNum
    FROM    @Events x
) y
WHERE   y.SumOfEvents = 0 -- Only finished events
GROUP BY y.EventPersonName, y.EventNum
ORDER BY EventPersonName, y.EventNum;

Output:

EventPersonName EventNum StartDate              EndDate                EventPersonLocation
--------------- -------- ---------------------- ---------------------- -------------------
bob             1        2014-05-27 06:00:00    2014-05-27 08:00:00    Belfast
john            1        2014-05-27 01:00:00    2014-05-27 03:00:00    Dublin
sally           1        2014-05-27 02:00:00    2014-05-27 05:00:00    New Yourk
sally           2        2014-05-27 06:30:00    2014-05-27 07:00:00    Sydney
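One caveat: the `ELSE 1/0` branch appears intended to raise a divide-by-zero error on any unexpected event type, whereas the question says other event types exist and should be ignored. A possible adjustment, sketched here against the question's Events table rather than the `@Events` table variable, is to filter those rows out before the window functions run. It also takes the location explicitly from the 'start' row, so the result does not depend on 'end' rows storing a real NULL:

```sql
SELECT  y.EventPersonName,
        y.EventNum,
        MIN(y.EventDate) AS StartDate,
        MAX(y.EventDate) AS EndDate,
        -- take the location from the 'start' row of each pair
        MAX(CASE WHEN y.EventName = 'start' THEN y.EventPersonLocation END) AS EventPersonLocation
FROM
(
    SELECT  x.EventName,
            x.EventPersonName,
            x.EventDate,
            x.EventPersonLocation,
            -- only 'start' and 'end' rows reach this point, so ELSE means 'end'
            SUM(CASE WHEN x.EventName = 'start' THEN 1 ELSE -1 END)
                OVER (PARTITION BY x.EventPersonName) AS SumOfEvents,
            (ROW_NUMBER() OVER (PARTITION BY x.EventPersonName ORDER BY x.EventDate ASC) + 1) / 2 AS EventNum
    FROM    Events x
    WHERE   x.EventName IN ('start', 'end')   -- ignore other event types
) y
WHERE   y.SumOfEvents = 0 -- only persons whose starts and ends balance
GROUP BY y.EventPersonName, y.EventNum
ORDER BY y.EventPersonName, y.EventNum;
```

The WHERE clause is applied before the window functions, so ignored events affect neither the sum nor the numbering. The sketch keeps the original's assumptions: events strictly alternate start, end per person, and a person with any unmatched event is excluded entirely.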

If you want to show only the names of persons then you could use:

SELECT  y.EventPersonName
FROM (
    SELECT  x.EventPersonName, 
            EventWithSign = CASE WHEN x.EventName = 'start' THEN +1 WHEN x.EventName = 'end' THEN -1 ELSE 1/0 END
    FROM    @Events x
) y
GROUP BY y.EventPersonName
HAVING  SUM(y.EventWithSign) = 0

A very simple approach would be to say: take each start where a later end event for the person exists:

SELECT EventPersonName, EventPersonLocation 
FROM Events start_events
WHERE EventName = 'start'
AND EXISTS
(
  SELECT *
  FROM Events end_events
  WHERE end_events.EventPersonName = start_events.EventPersonName
  AND end_events.EventDate > start_events.EventDate
  AND end_events.EventName = 'end'
);
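Since the question is ultimately about counting, the same EXISTS test can be grouped by location. A rough sketch, assuming the time filter applies to the 'start' event (any matching 'end' is then necessarily later than the cutoff); `@Since` and `CompletedCount` are hypothetical names introduced for illustration:

```sql
-- @Since is a hypothetical cutoff variable, not part of the original answer.
DECLARE @Since DATETIME2(0) = '2014-05-27 00:30';

SELECT  start_events.EventPersonLocation,
        COUNT(*) AS CompletedCount
FROM    Events start_events
WHERE   start_events.EventName = 'start'
  AND   start_events.EventDate >= @Since   -- assumption: filter on the start time
  AND   EXISTS
(
  SELECT *
  FROM   Events end_events
  WHERE  end_events.EventPersonName = start_events.EventPersonName
    AND  end_events.EventDate > start_events.EventDate
    AND  end_events.EventName = 'end'
)
GROUP BY start_events.EventPersonLocation;
```

Note that this counts every start that has any later end, so a sequence such as start, start, end for one person would count both starts.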
