
Explaining the forecasts from an ARIMA model

I am trying to understand the forecasts I get from fitting an ARIMA model to a time series. The data is from the M1-Competition; the series is MNB65. I fit an ARIMA(1,0,0) model in R and computed forecasts. Here are some output snippets:

> arima(x, order = c(1,0,0))
Series: x 
ARIMA(1,0,0) with non-zero mean 
Call: arima(x = x, order = c(1, 0, 0)) 
Coefficients:
         ar1  intercept
      0.9421  12260.298
s.e.  0.0474    202.717

> predict(arima(x, order = c(1,0,0)), n.ahead=12)
$pred
Time Series:
Start = 53 
End = 64 
Frequency = 1 
[1] 11757.39 11786.50 11813.92 11839.75 11864.09 11887.02 11908.62 11928.97 11948.15 11966.21 11983.23 11999.27

I have a few questions:

(1) How do I explain that although the dataset shows a clear downward trend, the forecast from this model trends upward? The same thing happens with ARIMA(2,0,0), which auto.arima (forecast package) selects as the best ARIMA fit for the data, and with an ARIMA(1,0,1) model.

(2) The intercept value for the ARIMA(1,0,0) model is 12260.298. Shouldn't the intercept satisfy the equation C = mean * (1 - sum(AR coeffs)), in which case the value should be 715.52? I must be missing something basic here.

(3) This is clearly a series with a non-stationary mean. Why does auto.arima still select an AR(2) model as the best model? Is there an intuitive explanation?

Thanks.

  1. No ARIMA(p,0,q) model will allow for a trend, because the model is stationary. If you really want to include a trend, use ARIMA(p,1,q) with a drift term, or ARIMA(p,2,q). The fact that auto.arima() is suggesting 0 differences would usually indicate there is no clear trend.

  2. The help file for arima() shows that the intercept is actually the mean. That is, the AR(1) model is (Y_t - c) = ϕ(Y_{t-1} - c) + e_t rather than Y_t = c + ϕY_{t-1} + e_t as you might expect.

  3. auto.arima() uses a unit root test to determine the number of differences required, so check the results of the unit root test to see what's going on. You can always specify the required number of differences in auto.arima() if you think the unit root tests are not leading to a sensible model.
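
Points 1 and 2 can be checked against the numbers printed in the question. The following is a minimal sketch in base R, assuming only the rounded coefficients and the forecasts shown above (the variable names are mine, not from the original post). Under the mean form of the model, the h-step forecast is mean + ϕ^h * (last value - mean), so the forecasts decay geometrically toward the estimated mean of about 12260. The first forecast lies below that mean, which is why the forecasts rise even though the series has been falling.

```r
# Rounded estimates copied from the arima() output in the question.
phi <- 0.9421      # ar1
mu  <- 12260.298   # "intercept", which arima() reports as the series mean

# Constant in the alternative form Y_t = c + phi * Y_{t-1} + e_t.
c_const <- mu * (1 - phi)

# Start from the first printed forecast and apply the mean-reversion formula.
f1    <- 11757.39
steps <- 0:11
fc    <- mu + phi^steps * (f1 - mu)

# Forecasts printed by predict() in the question, for comparison.
printed <- c(11757.39, 11786.50, 11813.92, 11839.75, 11864.09, 11887.02,
             11908.62, 11928.97, 11948.15, 11966.21, 11983.23, 11999.27)

# Any remaining gap comes from using rounded values of phi and mu.
max_gap <- max(abs(fc - printed))

print(round(c_const, 2))
print(round(fc, 2))
print(max_gap)
```

The recomputed values track the printed forecasts closely, and c_const is the quantity the question expected to see labelled as the intercept.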

Here are the results of two unit root tests on your data (adf.test() and kpss.test() are from the tseries package):

R> adf.test(x)

        Augmented Dickey-Fuller Test

data:  x 
Dickey-Fuller = -1.031, Lag order = 3, p-value = 0.9249
alternative hypothesis: stationary 

R> kpss.test(x)

        KPSS Test for Level Stationarity

data:  x 
KPSS Level = 0.3491, Truncation lag parameter = 1, p-value = 0.09909

So the ADF test comes nowhere near rejecting non-stationarity (its null hypothesis; p = 0.92), while the KPSS test doesn't quite reject stationarity (its null hypothesis; p = 0.099). auto.arima() uses the latter by default. You could use auto.arima(x, test="adf") if you wanted the first test. In that case, it suggests the model ARIMA(0,2,1), which does have a trend.
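
To see what a model with a trend does to the forecasts, here is a rough sketch in base R. It uses a simulated series as a stand-in (not MNB65), and it fits an ARIMA(0,1,0) with drift by passing a time index as a regressor to arima(), which is one way to get a drift term from stats::arima(). The names x, t_idx and drift are illustrative. With the forecast package, the equivalent route would be the one described above, for example auto.arima(x, d = 1) or auto.arima(x, test = "adf").

```r
set.seed(1)
n <- 52

# Simulated stand-in for a downward-trending series:
# a random walk whose steps have a negative mean (the drift).
x <- ts(14000 + cumsum(rnorm(n, mean = -40, sd = 25)))

# A linear time index as regressor; after differencing it becomes a constant,
# so its coefficient plays the role of the drift.
t_idx <- seq_along(x)
fit   <- arima(x, order = c(0, 1, 0), xreg = t_idx)
drift <- coef(fit)[["t_idx"]]

# Forecast 12 steps ahead; the regressor must be extended over the horizon.
h  <- 12
fc <- predict(fit, n.ahead = h, newxreg = n + seq_len(h))$pred

print(drift)
print(round(fc, 1))
```

Here each forecast is the last observation plus a multiple of the estimated drift, so the forecasts continue the downward trend instead of reverting to a mean.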

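Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.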

 