Calculate time difference by condition

My data contains start and finish times for workers on their shifts. I wish to know the duration of each shift, according to each worker.

The dataset is quite large, many workers and many shifts, so here is a small example:

           TimeStart          TimeFinish ShiftNo       Worker
               <dttm>              <dttm>  <fctr>       <fctr>
1 2017-04-10 00:06:18 2017-04-10 00:06:19      S1 Caleb 
2 2017-04-10 00:19:56 2017-04-10 00:20:16      S1 Caleb 
3 2017-04-10 00:00:00 2017-04-10 00:00:20      S2 Caleb 
4 2017-04-10 00:08:32 2017-04-10 00:08:52      S2 Caleb 
5 2017-04-10 00:25:35 2017-04-10 00:25:55      S2 Caleb 
6 2017-04-10 00:00:00 2017-04-10 00:00:19      S3 Caleb 

I wish to calculate the length of each shift by subtracting the first entry of TimeStart from the last entry of TimeFinish.

Ideally, I would like to do this in dplyr but I don't think this is the correct code?

ShiftDuration <- df %>%
  group_by(Worker, Shift) %>% 
  summarise(Duration = TimeFinish-TimeStart)

Any help would be greatly appreciated.

You're almost there. Your group_by should be (Worker, ShiftNo) (not Shift, assuming your example data is correct). Presumably you want the minimum start time and maximum finish time, per worker, per shift:

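df %>% 
  group_by(Worker, ShiftNo) %>% 
  summarise(Duration = difftime(max(TimeFinish), min(TimeStart), units = "mins"))

  Worker ShiftNo          Duration
   <chr>   <chr>            <time>
1  Caleb      S1   13.9666667 mins
2  Caleb      S2   25.9166667 mins
3  Caleb      S3    0.3166667 mins

Note the explicit units = "mins". A plain max(TimeFinish) - min(TimeStart) lets R choose the units separately for each group, so the 19-second S3 shift is computed in seconds but can end up labelled as 19 mins once the groups are combined.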
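
To try this on the sample data, here is a minimal self-contained sketch. It assumes dplyr is installed and that the timestamps are in UTC (the question does not state a time zone); the names shift_duration and DurationMins are made up for illustration. It passes units to difftime() explicitly and converts to plain numbers, so every shift is reported in minutes however short it is.

```r
library(dplyr)

# Sample data from the question. The UTC time zone is an assumption.
df <- data.frame(
  TimeStart = as.POSIXct(
    c("2017-04-10 00:06:18", "2017-04-10 00:19:56", "2017-04-10 00:00:00",
      "2017-04-10 00:08:32", "2017-04-10 00:25:35", "2017-04-10 00:00:00"),
    tz = "UTC"
  ),
  TimeFinish = as.POSIXct(
    c("2017-04-10 00:06:19", "2017-04-10 00:20:16", "2017-04-10 00:00:20",
      "2017-04-10 00:08:52", "2017-04-10 00:25:55", "2017-04-10 00:00:19"),
    tz = "UTC"
  ),
  ShiftNo = factor(c("S1", "S1", "S2", "S2", "S2", "S3")),
  Worker  = factor(rep("Caleb", 6))
)

# One row per worker and shift: latest finish minus earliest start,
# always expressed in minutes and stored as an ordinary numeric column.
shift_duration <- df %>%
  group_by(Worker, ShiftNo) %>%
  summarise(
    DurationMins = as.numeric(
      difftime(max(TimeFinish), min(TimeStart), units = "mins")
    )
  ) %>%
  ungroup()

print(shift_duration)
```

Storing the result as a numeric column (rather than a difftime) makes it easier to sum or average durations later without worrying about mixed units.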