Finding difference in time between two data frames using R

Question

I have two data frame ,one is the in time of employees and the other is the out time of employees.The data in both the data frames have timestamps for about 4000 employees in the last one year(excludes weekend/public holiday dates).Each data frame has 4000 rows and 250 columns.I would like to find the number of hours spent by an employee each day at work basically my approach would be to find the difference in time between the two data frames using difftime() function.i used the below code and expected a resulting data frame containing 4000 rows and 250 columns with difference in time,however the data was returned in one single column.How should I deal with this problem so that I can get the difference in time between two data frames in the data frame format with 4000 rows and 250 columns?

hours_spent <- as.data.frame(as.matrix(difftime(as.matrix(out_time_data_hrs),as.matrix(in_time_data_hrs),unit='hour')))

Input data looks like below ,

In_time data frame

Out_time data frame

Expected output

Answer 1

Here's a small and simple example based on the data you posted and a possible solution:

# example data in_times
df1 = data.frame(`2018-08-01` = c("2018-08-01 10:30:00", "2018-08-01 10:25:00"),
                 `2018-08-02` = c("2018-08-02 10:20:00", "2018-08-02 10:45:00"))
# example data out_times
df2 = data.frame(`2018-08-01` = c("2018-08-01 17:33:00", "2018-08-01 18:06:00"),
                 `2018-08-02` = c("2018-08-02 17:11:00", "2018-08-02 17:45:00"))

library(tidyverse)

# reshape datasets
df1_resh = df1 %>%
  mutate(empl_id = row_number()) %>%   # add an employee id (using the row number)
  gather(day, in_time, -empl_id)       # reshape dataset

df2_resh = df2 %>%
  mutate(empl_id = row_number()) %>%
  gather(day, out_time, -empl_id)

# join datasets and calculate hours spent
left_join(df1_resh, df2_resh, by=c("empl_id","day")) %>%
  mutate(hours_spent = difftime(out_time, in_time))

#   empl_id         day             in_time            out_time    hours_spent
# 1       1 X2018.08.01 2018-08-01 10:30:00 2018-08-01 17:33:00 7.050000 hours
# 2       2 X2018.08.01 2018-08-01 10:25:00 2018-08-01 18:06:00 7.683333 hours
# 3       1 X2018.08.02 2018-08-02 10:20:00 2018-08-02 17:11:00 6.850000 hours
# 4       2 X2018.08.02 2018-08-02 10:45:00 2018-08-02 17:45:00 7.000000 hours

You can use this as the final piece of code if you want to reshape back to your initial format:

left_join(df1_resh, df2_resh, by=c("empl_id","day")) %>%
  mutate(hours_spent = difftime(out_time, in_time)) %>%
  select(empl_id, day, hours_spent) %>%
  spread(day, hours_spent)

#   empl_id    X2018.08.01 X2018.08.02
# 1       1 7.050000 hours  6.85 hours
# 2       2 7.683333 hours  7.00 hours

Answer 2

我的要求可以满足，只需做下面的事情就可以了

employee_hrs_df <- out_time_data - in_time_data

Finding difference in time between two data frames using R

Question

2 answers

solution1
2 2018-08-19 12:56:48

solution2
0 2018-08-20 00:23:19

Finding difference in time between two data frames using R

Question

2 answers

solution1 2 2018-08-19 12:56:48

solution2 0 2018-08-20 00:23:19

solution1
2 2018-08-19 12:56:48

solution2
0 2018-08-20 00:23:19