So far I have been using this method from Professor Hyndman when I have multiple time series to forecast, but with a large number of series it is fairly slow.
Now I am trying to use the apply() function as follows:
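library(forecast)
# Fit an ARIMA model to one series and return the 12-step-ahead point forecasts
fc_func <- function(y) {
  forecast(auto.arima(y), h = 12)$mean
}
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv", header = FALSE)
# Drop the first column; monthly data starting in April 1982
retail <- ts(retail[, -1], frequency = 12, start = 1982 + 3/12)
frc <- apply(retail, 2, fc_func)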
This seems to work well, but when I use a for loop as follows:
ns <- ncol(retail)
h <- 12
fcast <- matrix(NA, nrow = h, ncol = ns)
for (i in seq_len(ns)) {
  fcast[, i] <- forecast(auto.arima(retail[, i]), h = h)$mean
}
I get different point forecasts. What is the reason?
Edit: I fixed it by changing the fc_func function. It now returns the same results as the for loop, but it is also as slow as the for loop:
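fc_func <- function(x) {
  # Rebuild the monthly ts from the plain column that apply() passes in
  y <- ts(x, frequency = 12, start = 1982 + 3/12)
  forecast(auto.arima(y), h = 12)$mean
}
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv", header = FALSE)
retail <- ts(retail[, -1], frequency = 12, start = 1982 + 3/12)
frc <- apply(retail, 2, fc_func)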
For debugging, I added some print statements to the function passed to apply(). The interesting one is class(y):
library(forecast)
fc_func <- function(y){
print(length(y))
print(class(y))
#print(y)
forecast(auto.arima(y),h=12)$mean
}
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv",header=FALSE)
retail <- ts(retail[,-1],f=12,s=1982+3/12)
retail2 = retail
#retail = retail2[1:333,1:42]
frc<- apply(retail,2 ,fc_func)
Inside apply(), every y arrives as a plain numeric vector:
> frc<- apply(retail,2 ,fc_func)
[1] 333
[1] "numeric"
[1] 333
[1] "numeric"
[1] 333
[1] "numeric"
[1] 333
[1] "numeric"
[1] 333
This is different in the for-loop:
ns <- ncol(retail)
h <- 12
fcast1 <- matrix(NA,nrow=h,ncol=ns)
for(i in 1:ns){
print(length(retail[,i]))
print(class(retail[,i]))
#print(retail[,i])
fcast1[,i] <- forecast(auto.arima(retail[,i]),h=h)$mean
}
Here the columns are passed to auto.arima as ts objects:
> for(i in 1:ns){
+ print(length(retail[,i]))
+ print(class(retail[,i]))
+ #print(retail[,i])
+ fcast1[,i] <- forecast(auto.arima(retail[,i]),h=h)$mean
+ }
[1] 333
[1] "ts"
[1] 333
[1] "ts"
[1] 333
[1] "ts"
[1] 333
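The same class difference can be reproduced without the retail data or the forecast package. The following is a minimal sketch on a small synthetic monthly series (toy is a made-up stand-in for retail, not part of the original post). It only inspects what each approach hands to the function. As far as I understand, auto.arima() takes the seasonal period from frequency(y), so a column that reports a frequency of 1 would be fitted without seasonal terms, which would explain both the different forecasts and the shorter run time.

```r
# Synthetic stand-in for `retail`: 24 monthly observations of 3 series.
# (`toy` is a made-up object used only for illustration.)
set.seed(1)
toy <- ts(matrix(rnorm(72), ncol = 3), frequency = 12, start = c(1982, 4))

# What apply() passes to the function, one column at a time.
apply_class <- apply(toy, 2, function(y) class(y)[1])
apply_freq  <- apply(toy, 2, frequency)

# What column indexing inside a for loop passes.
loop_class <- character(ncol(toy))
loop_freq  <- numeric(ncol(toy))
for (i in seq_len(ncol(toy))) {
  loop_class[i] <- class(toy[, i])[1]
  loop_freq[i]  <- frequency(toy[, i])
}

print(unname(apply_class))  # "numeric" for every column
print(unname(apply_freq))   # 1 for every column: the monthly frequency is gone
print(loop_class)           # "ts" for every column
print(loop_freq)            # 12 for every column
```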
I guess this causes the differences, because when I reduce retail to a plain matrix with
retail = retail[1:NROW(retail), 1:NCOL(retail)]
and run the for loop again, I get exactly the same results as in the apply version:
all.equal(frc, fcast1)
So I guess you have to convert the variables back to ts inside fc_func before passing them to the forecast function.
As a workaround (and because I had no idea how to transform y into the desired ts object), you could use an sapply version:
fc_func2 <- function(y){
forecast(auto.arima(retail[,y]),h=12)$mean
}
frc2 <- sapply(1:NCOL(retail), fc_func2)
It should give the desired values, but I'm not sure whether it is any faster than the loop version.
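For the missing step of turning y back into a ts object inside the function, one option is to copy the time attributes from the original object rather than hard-coding them. This is a minimal sketch, assuming a hypothetical wrapper with_ts() (not part of base R or forecast) and a toy series in place of retail. frequency() stands in for the forecasting function so that the example runs without the forecast package.

```r
# Hypothetical helper: wraps a function so that each column handed over by
# apply() is rebuilt as a ts with the start and frequency of the original.
with_ts <- function(f, template) {
  tsp_template <- tsp(template)  # c(start, end, frequency)
  function(y, ...) {
    y_ts <- ts(y, start = tsp_template[1], frequency = tsp_template[3])
    f(y_ts, ...)
  }
}

# Toy monthly data standing in for `retail`.
toy <- ts(matrix(seq_len(72), ncol = 3), frequency = 12, start = c(1982, 4))

# Without the wrapper the frequency is lost; with it, it is restored.
lost     <- apply(toy, 2, frequency)
restored <- apply(toy, 2, with_ts(frequency, toy))
print(unname(lost))      # 1 1 1
print(unname(restored))  # 12 12 12
```

With forecast loaded, the same wrapper could presumably be used as apply(retail, 2, with_ts(fc_func, retail)). I would not expect it to be faster than the loop, since the time goes into fitting the models rather than into the iteration itself.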
The issue is that apply() changes the class of the time series object, retail. As the most basic member of the apply family, apply() is best used for simple matrix objects. It coerces its input to a matrix with as.matrix() when called, which is why using apply() on data frames is often discouraged.
Per the ?apply docs:
If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.
So apply() does not preserve the class of its input when it passes each column to fc_func:
class(retail)
# [1] "mts" "ts" "matrix"
One can see this with sapply, which runs just as slowly as the for loop and, once the dimnames are removed, returns exactly the same result as the for loop:
# LOOP VERSION
ns <- ncol(retail)
h <- 12
fcast1 <- matrix(NA,nrow=h,ncol=ns)
for(i in 1:ns) {
fcast1[,i] <- forecast(auto.arima(retail[,i]), h=h)$mean
}
# SAPPLY VERSION
frc_test <- sapply(retail, fc_func, USE.NAMES = FALSE)
dimnames(frc_test) <- NULL
identical(frc_test, fcast1)
# [1] TRUE