
How to convert a dataframe into an hourly time series in R

I have the following dataframe in R:

   hourly_calls                 total_calls
   2017-12-01 08:00-08:59       39
   2017-12-01 09:00-09:59       29
   2017-12-01 10:00-10:59       57
   2017-12-01 11:00-11:59       90
   2017-12-01 12:00-12:59       23
   2017-12-01 13:00-13:59       45
   2017-12-01 14:00-14:59       54
   2017-12-01 15:00-15:59       39
   2017-12-01 16:00-16:59       29
   2017-12-01 17:00-17:00       27
   2017-12-04 08:00-08:59       49
   2017-12-04 09:00-09:59       69
   2017-12-04 10:00-10:59       27
   2017-12-04 11:00-11:59       60
   2017-12-04 12:00-12:59       23
   2017-12-04 13:00-13:59       85
   2017-12-04 14:00-14:59       14
   2017-12-04 15:00-15:59       39
   2017-12-04 16:00-16:59       59
   2017-12-04 17:00-17:00       67

This dataframe holds a call center's call volume for every hour (9-hour shift, 5 days a week). I want to convert it into an hourly time series so that I can forecast the next hours.

This is how I am doing it

 library(forecast)  # provides msts()
 train <- df[1:1152,]
 test <- df[1153:1206,]
 train <- msts(train[['total_calls']], seasonal.periods=c(9))
 test <- msts(test[['total_calls']], seasonal.periods=c(9))

How can I do this in R?

The main problem with your data is that the first column, hourly_calls, holds a range of time rather than a single timestamp. Because of that, it won't be converted to a date-time automatically when you prepare a ts. One option is to keep only the start time of each range and build the time series from that.

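library(tidyverse)   # dplyr (%>%, mutate) and tidyr (extract)
library(lubridate)   # ymd_hm()
library(xts)
library(forecast)    # tslm(), forecast()

# Split "YYYY-MM-DD HH:MM-HH:MM" into start and end, then parse the start time
data <- df %>%
  extract(hourly_calls, c("StartTm", "EndTm"),
          regex = "(^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2})-(\\d{2}:\\d{2})") %>%
  mutate(StartTm = ymd_hm(StartTm))

# Only StartTm is used as the time index
xtsData <- xts(data$total_calls, order.by = data$StartTm)

# Row indices refer to the full dataset in the question, not the 20-row sample
train <- xtsData[1:1152, ]
test  <- xtsData[1153:1206, ]

trainTS <- ts(as.numeric(train), frequency = 9)  # 9 hours a day
fit <- tslm(trainTS ~ season + trend)

# season and trend are generated by tslm, so only the horizon is needed
forecast(fit, h = length(test))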
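
As a cross-check on the start-time approach, here is a minimal base-R sketch (no extra packages) that strips the end of each range, parses the start time, and counts the rows per day before choosing the seasonal period. The names start_chr, obs_per_day, period and calls_ts are illustrative only, and the UTC time zone is an assumption because the question does not state one. On the sample rows the count comes out as ten per day (08:00 through 17:00), so the period may need to be 10 rather than 9 if the full data follows the same pattern.

```r
# Sample rows copied from the question
df <- read.table(text =
"hourly_calls                 total_calls
'2017-12-01 08:00-08:59'       39
'2017-12-01 09:00-09:59'       29
'2017-12-01 10:00-10:59'       57
'2017-12-01 11:00-11:59'       90
'2017-12-01 12:00-12:59'       23
'2017-12-01 13:00-13:59'       45
'2017-12-01 14:00-14:59'       54
'2017-12-01 15:00-15:59'       39
'2017-12-01 16:00-16:59'       29
'2017-12-01 17:00-17:00'       27
'2017-12-04 08:00-08:59'       49
'2017-12-04 09:00-09:59'       69
'2017-12-04 10:00-10:59'       27
'2017-12-04 11:00-11:59'       60
'2017-12-04 12:00-12:59'       23
'2017-12-04 13:00-13:59'       85
'2017-12-04 14:00-14:59'       14
'2017-12-04 15:00-15:59'       39
'2017-12-04 16:00-16:59'       59
'2017-12-04 17:00-17:00'       67",
header = TRUE, stringsAsFactors = FALSE)

# Drop the trailing "-HH:MM" so only "YYYY-MM-DD HH:MM" remains
start_chr <- sub("-[0-9]{2}:[0-9]{2}$", "", df$hourly_calls)

# Parse the start time; tz = "UTC" is an assumption, not from the question
df$StartTm <- as.POSIXct(start_chr, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Count rows per calendar day to decide the seasonal period
obs_per_day <- table(format(df$StartTm, "%Y-%m-%d"))
period <- as.integer(obs_per_day[[1]])

# Build the series with one "season" per working day
calls_ts <- ts(df$total_calls, frequency = period)
```

If the counts in obs_per_day differ between days, a single fixed frequency no longer describes the data, and the missing hours would need to be filled or dropped before calling ts().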

Data:

df <- read.table(text =
"hourly_calls                 total_calls
'2017-12-01 08:00-08:59'       39
'2017-12-01 09:00-09:59'       29
'2017-12-01 10:00-10:59'       57
'2017-12-01 11:00-11:59'       90
'2017-12-01 12:00-12:59'       23
'2017-12-01 13:00-13:59'       45
'2017-12-01 14:00-14:59'       54
'2017-12-01 15:00-15:59'       39
'2017-12-01 16:00-16:59'       29
'2017-12-01 17:00-17:00'       27
'2017-12-04 08:00-08:59'       49
'2017-12-04 09:00-09:59'       69
'2017-12-04 10:00-10:59'       27
'2017-12-04 11:00-11:59'       60
'2017-12-04 12:00-12:59'       23
'2017-12-04 13:00-13:59'       85
'2017-12-04 14:00-14:59'       14
'2017-12-04 15:00-15:59'       39
'2017-12-04 16:00-16:59'       59
'2017-12-04 17:00-17:00'       67",
header = TRUE, stringsAsFactors = FALSE)
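
Note that this sample has only 20 rows, so row indices such as 1:1152 and 1153:1206 only make sense on the full dataset. A hedged sketch of splitting by whole days instead of hard-coded indices is shown here in base R. The values are the total_calls column from the sample; period = 10 reflects the ten rows per day visible in it, and n_test_days is a hypothetical choice for illustration.

```r
# total_calls from the sample, in row order
calls <- c(39, 29, 57, 90, 23, 45, 54, 39, 29, 27,   # 2017-12-01
           49, 69, 27, 60, 23, 85, 14, 39, 59, 67)   # 2017-12-04

period <- 10       # rows per day in the sample
n_test_days <- 1   # hypothetical: hold out the last day

# Everything before the held-out days is training data
n_train <- length(calls) - n_test_days * period
train_ts <- ts(calls[seq_len(n_train)], frequency = period)

# Start the test series one step after the training series ends
test_ts <- ts(calls[(n_train + 1):length(calls)],
              frequency = period,
              start = tsp(train_ts)[2] + 1 / period)
```

Keeping the split on a day boundary means both series begin at the first hour of a day, so the seasonal positions line up between training and test data.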


 