
Implementation of time series cross-validation

I am working with time series 551 of the monthly data from the M3 competition.

So, my data is:

library(forecast)
library(Mcomp)
# Time Series
# Subset the M3 data to contain the relevant series 
ts.data<- subset(M3, 12)[[551]]
print(ts.data)

I want to implement time series cross-validation for the last 18 observations of the in-sample interval.

Some people would call this "forecast evaluation with a rolling origin" or something similar.

How can I achieve that? What does the in-sample interval mean? Which time series must I evaluate?

I'm quite confused, so any help clarifying this would be welcome.

The tsCV function of the forecast package is a good place to start.

From its documentation,

tsCV(y, forecastfunction, h = 1, window = NULL, xreg = NULL, initial = 0, ...)

Let 'y' contain the time series y[1:T]. Then 'forecastfunction' is applied successively to the time series y[1:t], for t=1,...,T-h, making predictions f[t+h]. The errors are given by e[t+h] = y[t+h]-f[t+h].

That is, tsCV first fits a model to y[1] and forecasts y[1 + h], then fits a model to y[1:2] and forecasts y[2 + h], and so on for T-h steps.
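The successive fits described above can be written out as an explicit loop. The following is only a minimal base-R sketch of that definition, not the internals of tsCV: `naive_fc` and `rolling_errors` are hypothetical names introduced for illustration, the forecast is a simple "repeat the last value" rule rather than a fitted model, and the errors are stored at the target index t+h as in the quoted formula (the alignment of the vector that tsCV itself returns may differ).

```r
# Hypothetical helper: repeat the last observed value h steps ahead (naive forecast).
naive_fc <- function(x, h) rep(x[length(x)], h)

# Rolling-origin errors following e[t+h] = y[t+h] - f[t+h], for t = 1, ..., T-h.
rolling_errors <- function(y, fc_fun, h = 1) {
  n <- length(y)
  e <- rep(NA_real_, n)            # the first h entries stay NA (no forecast targets them)
  for (t in seq_len(n - h)) {
    f <- fc_fun(y[1:t], h)[h]      # h-step-ahead forecast made at origin t
    e[t + h] <- y[t + h] - f
  }
  e
}

# Toy series, assumed purely for illustration.
y <- c(1, 3, 6, 10)
errs <- rolling_errors(y, naive_fc, h = 1)
print(errs)

# Same RMSE summary as is typically computed from forecast errors.
rmse <- sqrt(mean(errs^2, na.rm = TRUE))
print(rmse)
```

Any function that takes a series and a horizon and returns h point forecasts could be passed in place of `naive_fc`.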

The tsCV function returns the forecast errors.

Applying this to the training data of ts.data:

# function to fit a model and forecast
fmodel <- function(x, h){
  forecast(Arima(x, order=c(1,1,1), seasonal = c(0, 0, 2)), h=h)
}
 
# time-series CV
cv_errs <- tsCV(ts.data$x, fmodel, h = 1)

# RMSE of the time-series CV
sqrt(mean(cv_errs^2, na.rm=TRUE))
# [1] 778.7898

In your case, it may be that you are supposed to

  1. fit a model to ts.data$x and then forecast ts.data$xx[1]
  2. fit a model to c(ts.data$x, ts.data$xx[1]) and forecast ts.data$xx[2], and so on.
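If the question is instead read literally, the rolling origin would be restricted to the last 18 observations of the in-sample series (in Mcomp objects, `$x` holds the historical data and `$xx` the future data). The sketch below shows one way that might look in base R. It rests on several assumptions: `AirPassengers` is used as a stand-in for `ts.data$x` so the example runs without extra packages, `snaive_1step` is a hypothetical one-step forecaster (the value from 12 months earlier) standing in for a fitted model, and the horizon is fixed at one step.

```r
# Stand-in data: AirPassengers ships with base R; replace with ts.data$x.
y <- as.numeric(AirPassengers)
n <- length(y)
k <- 18                                  # number of final in-sample points to evaluate

# Hypothetical one-step forecaster: the value observed 12 months before the target.
snaive_1step <- function(x) x[length(x) - 11]

errs <- numeric(k)
for (j in seq_len(k)) {
  origin <- n - k + j - 1                # index of the last observation used for fitting
  errs[j] <- y[origin + 1] - snaive_1step(y[1:origin])
}

# RMSE over the 18 one-step-ahead forecasts.
rmse <- sqrt(mean(errs^2))
print(rmse)
```

To evaluate on the held-out data as in the two steps listed above, the same loop could run over `c(ts.data$x, ts.data$xx)` with the first origin placed at `length(ts.data$x)`.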

