Time difference per person between consecutive rows

I have some data which (broadly speaking) consists of the following fields:

Person  TaskID   Start_time                      End_time
Alpha   1       'Wed, 18 Oct 2017 10:10:03 GMT' 'Wed, 18 Oct 2017 10:10:36 GMT'
Alpha   2       'Wed, 18 Oct 2017 10:11:16 GMT' 'Wed, 18 Oct 2017 10:11:28 GMT'
Beta    1       'Wed, 18 Oct 2017 10:12:03 GMT' 'Wed, 18 Oct 2017 10:12:49 GMT'
Alpha   3       'Wed, 18 Oct 2017 10:12:03 GMT' 'Wed, 18 Oct 2017 10:13:13 GMT'
Gamma   1       'Fri, 27 Oct 2017 22:57:12 GMT' 'Sat, 28 Oct 2017 02:00:54 GMT'
Beta    2       'Wed, 18 Oct 2017 10:13:40 GMT' 'Wed, 18 Oct 2017 10:14:03 GMT'

For this data, my required output is something like:

Person  TaskID Time_between_attempts
Alpha   1      NULL      ['Wed, 18 Oct 2017 10:10:03 GMT' - NULL]
Alpha   2      0:00:40   ['Wed, 18 Oct 2017 10:11:16 GMT' -'Wed, 18 Oct 2017 10:10:36 GMT']
Beta    1      NULL      ['Wed, 18 Oct 2017 10:12:03 GMT' - NULL]
Alpha   3      0:00:35   ['Wed, 18 Oct 2017 10:12:03 GMT' -'Wed, 18 Oct 2017 10:11:28 GMT']
Gamma   1      NULL      ['Fri, 27 Oct 2017 22:57:12 GMT' - NULL]
Beta    2      0:00:51   ['Wed, 18 Oct 2017 10:13:40 GMT' -'Wed, 18 Oct 2017 10:12:49 GMT']

My requirements are as below:

a. For a given person (Alpha, Beta or Gamma), the first occurrence has a zero/NULL 'Time_between_attempts'. In the example I have shown it as NULL.

b. On the second and every subsequent occurrence of the same person, 'Time_between_attempts' is non-NULL. It is calculated as the start time of the current task minus the end time of that person's previous task.
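
As a cross-check of requirements (a) and (b), here is a minimal Python sketch that computes the same gaps from the sample rows. It is only a reference calculation, not the SQL solution being asked for; the function name `time_between_attempts` and the tuple layout are assumptions made for illustration.

```python
from email.utils import parsedate_to_datetime

# Sample rows from the question: (person, task_id, start_time, end_time).
ROWS = [
    ("Alpha", 1, "Wed, 18 Oct 2017 10:10:03 GMT", "Wed, 18 Oct 2017 10:10:36 GMT"),
    ("Alpha", 2, "Wed, 18 Oct 2017 10:11:16 GMT", "Wed, 18 Oct 2017 10:11:28 GMT"),
    ("Beta", 1, "Wed, 18 Oct 2017 10:12:03 GMT", "Wed, 18 Oct 2017 10:12:49 GMT"),
    ("Alpha", 3, "Wed, 18 Oct 2017 10:12:03 GMT", "Wed, 18 Oct 2017 10:13:13 GMT"),
    ("Gamma", 1, "Fri, 27 Oct 2017 22:57:12 GMT", "Sat, 28 Oct 2017 02:00:54 GMT"),
    ("Beta", 2, "Wed, 18 Oct 2017 10:13:40 GMT", "Wed, 18 Oct 2017 10:14:03 GMT"),
]


def time_between_attempts(rows):
    """Map (person, task_id) to start minus the person's previous end, or None."""
    parsed = [
        (person, task, parsedate_to_datetime(start), parsedate_to_datetime(end))
        for person, task, start, end in rows
    ]
    # Order each person's tasks by start time, as the requirement implies.
    parsed.sort(key=lambda row: (row[0], row[2]))

    previous_end = {}  # person -> end time of that person's latest task so far
    gaps = {}
    for person, task, start, end in parsed:
        prev = previous_end.get(person)
        gaps[(person, task)] = None if prev is None else start - prev
        previous_end[person] = end
    return gaps


gaps = time_between_attempts(ROWS)
for (person, task), gap in gaps.items():
    print(person, task, "NULL" if gap is None else gap)
```

The non-NULL values agree with the desired output above: 0:00:40 and 0:00:35 for Alpha, 0:00:51 for Beta.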

My question in this regard:

  1. How can I write a SQL query that produces the desired output?

Please note that TaskID is written as an integer just for simplification. In the original data, TaskID is more complicated and consists of non-consecutive strings such as:

'q:1392763916495:441',
'q:1392763916495:436'

Any advice on this would be greatly appreciated.

This answers the original version of the question.

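You can use lag() and timestampdiff() for the calculation. Assuming the columns are real datetime or timestamp values, you can calculate the gap in seconds. Note that timestampdiff() takes the unit first and returns the second argument minus the first:

select t.*,
       -- seconds from the previous task's end to this task's start;
       -- NULL for each person's first task.
       -- person_id: use your person column (Person in the sample data)
       timestampdiff(second,
                     lag(end_time) over (partition by person_id order by start_time),
                     start_time
                    ) as seconds_between_attempts
from t;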

If the values are stored as strings, fix the data. In the meantime, you can wrap each column in str_to_date() inside the function call, for example str_to_date(start_time, '%a, %d %b %Y %H:%i:%s').

To get this as a time value:

select t.*,
       (time(0) +
        interval timestampdiff(second,
                               lag(end_time) over (partition by person_id order by start_time),
                               start_time
                              ) second
       ) as time_between_attempts
from t;
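
The lag() approach can also be tried without a MySQL server. The sketch below runs the same window-function idea in SQLite through Python's sqlite3 module. This is a swapped-in setup, not the MySQL query itself: it assumes an SQLite build with window functions (3.25 or later), replaces timestampdiff() with a difference of strftime('%s', ...) epoch seconds, and uses hypothetical names for the table `t` and the column `seconds_between_attempts`.

```python
import sqlite3
from email.utils import parsedate_to_datetime

# Sample rows from the question: (person, taskid, start_time, end_time).
SAMPLE = [
    ("Alpha", 1, "Wed, 18 Oct 2017 10:10:03 GMT", "Wed, 18 Oct 2017 10:10:36 GMT"),
    ("Alpha", 2, "Wed, 18 Oct 2017 10:11:16 GMT", "Wed, 18 Oct 2017 10:11:28 GMT"),
    ("Beta", 1, "Wed, 18 Oct 2017 10:12:03 GMT", "Wed, 18 Oct 2017 10:12:49 GMT"),
    ("Alpha", 3, "Wed, 18 Oct 2017 10:12:03 GMT", "Wed, 18 Oct 2017 10:13:13 GMT"),
    ("Gamma", 1, "Fri, 27 Oct 2017 22:57:12 GMT", "Sat, 28 Oct 2017 02:00:54 GMT"),
    ("Beta", 2, "Wed, 18 Oct 2017 10:13:40 GMT", "Wed, 18 Oct 2017 10:14:03 GMT"),
]


def to_iso(text):
    """Convert 'Wed, 18 Oct 2017 10:10:03 GMT' to 'YYYY-MM-DD HH:MM:SS' (UTC)."""
    return parsedate_to_datetime(text).strftime("%Y-%m-%d %H:%M:%S")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (person TEXT, taskid INTEGER, start_time TEXT, end_time TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(p, tid, to_iso(s), to_iso(e)) for p, tid, s, e in SAMPLE],
)

# LAG(end_time) looks up the previous row's end_time within the same person,
# ordered by start_time; it is NULL for each person's first task, so the
# subtraction is NULL there as well.
QUERY = """
SELECT person,
       taskid,
       strftime('%s', start_time)
         - strftime('%s', LAG(end_time) OVER (PARTITION BY person
                                              ORDER BY start_time))
         AS seconds_between_attempts
FROM t
ORDER BY person, start_time
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Because the window is ordered by start_time rather than by task ID, the query does not depend on task IDs being consecutive integers.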

Using a self-join:

    -- Pair each task (a) with the same person's previous task (b).
    -- The string columns are parsed with STR_TO_DATE before subtracting.
    SELECT a.person,
           a.taskid,
           TIMEDIFF(STR_TO_DATE(a.Start_time, '%a, %d %b %Y %H:%i:%s'),
                    STR_TO_DATE(b.End_time, '%a, %d %b %Y %H:%i:%s')
                   ) AS Time_between_attempts,
           a.Start_time,
           b.End_time
    FROM   test a
           LEFT JOIN test b
                  ON a.person = b.person
                 AND a.taskid = b.taskid + 1
    ORDER  BY 1, 2;

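Note that this ignores the time zone suffix (GMT) in the strings. The join on taskid + 1 also only works when task IDs are consecutive integers, which the question says is not the case in the original data.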
