ARIMA forecast gives different results with new python statsmodels

I'm doing out-of-sample forecasting with ARIMA(0,1,0).

With the latest stable version of Python's statsmodels, 0.12, I calculate:

import statsmodels.tsa.arima_model as stats

time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
steps = 4
alpha = 0.05

model = stats.ARIMA(time_series, order=(0, 1, 0))
model_fit = model.fit(disp=0)

forecast, _, intervals = model_fit.forecast(steps=steps, exog=None, alpha=alpha)

which results in

forecast = [21.125, 23.25, 25.375, 27.5]
intervals = [[19.5950036, 22.6549964 ], [21.08625835, 25.41374165], [22.72496851, 28.02503149], [24.44000721, 30.55999279]]

and a FutureWarning, which suggests:

FutureWarning: 
statsmodels.tsa.arima_model.ARMA and statsmodels.tsa.arima_model.ARIMA have
been deprecated in favor of statsmodels.tsa.arima.model.ARIMA (note the .
between arima and model) and
statsmodels.tsa.SARIMAX. These will be removed after the 0.12 release.

With the new interface, as hinted at in the FutureWarning, I calculate:

import statsmodels.tsa.arima.model as stats

time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
steps = 4
alpha = 0.05

model = stats.ARIMA(time_series, order=(0, 1, 0))
model_fit = model.fit()

forecast = model_fit.get_forecast(steps=steps)
forecasts_and_intervals = forecast.summary_frame(alpha=alpha)

which gives different results:

forecasts_and_intervals =
y  mean   mean_se  mean_ci_lower  mean_ci_upper
0  19.0  2.263842      14.562951      23.437049
1  19.0  3.201556      12.725066      25.274934
2  19.0  3.921089      11.314806      26.685194
3  19.0  4.527684      10.125903      27.874097

I would like to obtain the same results as before. Am I using the new interface correctly?

I need both the forecast and the intervals. I have already tried functions other than forecast that the new interface offers.

In particular, I'm wondering why the forecast is 19 for every step.

Many thanks for any help.

Here is the documentation for statsmodels 0.12.2: https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima_model.ARIMA.html?highlight=arima#statsmodels.tsa.arima_model.ARIMA

Here is the documentation for the newer version of ARIMA: https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima.model.ARIMA.html?highlight=arima#statsmodels.tsa.arima.model.ARIMA

The difference comes down to whether the model includes a "constant" term. The older statsmodels.tsa.arima_model.ARIMA includes a constant term by default (trend="c" in its fit method). When differencing is involved, it still includes the constant, but in the differenced domain (otherwise differencing would eliminate it anyway). So here is its ARIMA(0, 1, 0) model:

y_t - y_{t-1} = c + e_t

which is a "random walk with drift".

For the new statsmodels.tsa.arima.model.ARIMA, as the documentation you linked says, no trend term of any kind (including the constant, i.e. c) is included by default when differencing is involved, which is your case. So here is its ARIMA(0, 1, 0) model:

y_t - y_{t-1} = e_t

which is a plain "random walk". As we know, its forecasts are the naive forecasts, i.e. the last observed value repeated (19 in your case).

So what do you need to do to make the new one behave like the old one?
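Both models have closed-form forecasts, so the numbers in the question can be checked by hand. The following is a minimal sketch in plain Python rather than a description of what statsmodels does internally. It assumes the innovation variance is estimated as the mean squared residual of the differences, and the helper name random_walk_forecast is made up for illustration:

```python
import math
from statistics import NormalDist

time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
steps = 4
alpha = 0.05

# First differences y_t - y_{t-1}
diffs = [b - a for a, b in zip(time_series, time_series[1:])]


def random_walk_forecast(series, diffs, steps, alpha, with_drift):
    """Closed-form ARIMA(0, 1, 0) forecasts, with or without the constant c.

    Hypothetical helper: returns one (mean, se, lower, upper) tuple per step.
    """
    # With drift, c is the average difference; without it, c is fixed at 0.
    c = sum(diffs) / len(diffs) if with_drift else 0.0
    # Assumed variance estimate: mean squared residual of the differences.
    sigma2 = sum((d - c) ** 2 for d in diffs) / len(diffs)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    rows = []
    for h in range(1, steps + 1):
        mean = series[-1] + h * c      # last value plus accumulated drift
        se = math.sqrt(h * sigma2)     # uncertainty grows with sqrt(h)
        rows.append((mean, se, mean - z * se, mean + z * se))
    return rows


with_drift = random_walk_forecast(time_series, diffs, steps, alpha, True)
no_drift = random_walk_forecast(time_series, diffs, steps, alpha, False)

for label, rows in (("with drift", with_drift), ("no drift", no_drift)):
    print(label)
    for mean, se, lower, upper in rows:
        print(f"  {mean:.6f}  {se:.6f}  {lower:.6f}  {upper:.6f}")
```

The average difference is (19 - 2) / 8 = 2.125, which is why the drift forecasts climb by 2.125 per step from 19, while the no-drift forecasts stay at 19.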

The new class has a parameter called trend that you can set to get the same behaviour. Since you are differencing (d = 1), passing trend="t" should give the same model as the old one ("t" means a linear trend, but with d = 1 it reduces to a constant in the differenced domain):

import statsmodels.tsa.arima.model as stats

time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
steps = 4
alpha = 0.05

model = stats.ARIMA(time_series, order=(0, 1, 0), trend="t")   # only change is here!
model_fit = model.fit()

forecast = model_fit.get_forecast(steps=steps)
forecasts_and_intervals = forecast.summary_frame(alpha=alpha)

and here is what I get for forecasts_and_intervals:

y       mean   mean_se  mean_ci_lower  mean_ci_upper
0  21.124995  0.780622      19.595004      22.654986
1  23.249990  1.103966      21.086256      25.413724
2  25.374985  1.352077      22.724962      28.025008
3  27.499980  1.561244      24.439997      30.559963

I think this raises another issue. I'm not sure exogenous variables are treated the same way in the new arima.model version. I believe that in the old version, arima_model, they are applied at the order of differencing: for (0,0,0), Y = mx + b, and for (0,1,0), dy = mx + b.
