
Statsmodels ARIMA: Constant Value for Each Forecast

I'm trying to use statsmodels' ARIMA to forecast a time series. I'm using sklearn's TimeSeriesSplit to evaluate my models. Unfortunately, when I forecast the next fold of data (which has true values Y_test), I get a nearly constant prediction:

if is_arima:
    Y_train = Y_train.astype(float)
    # build a basic ARIMA(2,0,1) model (no exogenous variables are passed here)
    arima_model = ARIMA(Y_train, order=(2, 0, 1))
    # fit it
    arima_results = arima_model.fit()
    # forecast the next len(Y_test) values; with the legacy statsmodels ARIMA,
    # forecast() returns (forecast, stderr, conf_int), so [0] selects the forecasts
    preds = arima_results.forecast(steps=len(Y_test))[0]
    print(preds)

Which gives me:

115.65096239  120.89113477  121.52020239  121.59572014  121.60478583
  121.60587414  121.60600479  121.60602047  121.60602235  121.60602258
  121.6060226   121.60602261  121.60602261  121.60602261  121.60602261
  121.60602261  121.60602261  121.6060226   121.6060226   121.6060226
  121.6060226   121.6060226   121.6060226   121.6060226   121.6060226
  121.6060226   121.6060226   121.6060226   121.6060226   121.6060226...

This makes me think my ARIMA isn't using the prediction at time t for its prediction at time t+1?

I understand the output isn't perfectly constant, but my dataset shows large variation, so this is mildly concerning. Any idea what's going on?

Thanks!

You're using ARIMA(2,0,1), so your prediction is

x(t) = constant + w(t) + a1 * x(t-1) + a2 * x(t-2) + b1 * w(t-1)

So your prediction depends on two kinds of terms: the autoregressive terms and the moving average term. The autoregressive terms are a constant times the prior period's value, plus a different constant times the value two periods ago. The moving average term is a constant times the error from the prior period's prediction. When you forecast several steps ahead, the future errors are unknown and are replaced by their expected value of zero, so after the first step only the constant and the autoregressive terms remain. Your model is therefore probably dominated by the prior two periods, and it finds an equilibrium rather quickly.

Try printing out the parameters and then plugging them into Excel to see what is happening in the model.

print(arima_results.summary())
print(arima_results.params)

You are using a recursive strategy for multi-step prediction, i.e. the forecasts generated in earlier steps are fed back in to produce the later ones. Errors accumulate along the way, and as a result the forecast converges to a single value. ARIMA does not perform well when forecasting very long series.
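One possible alternative to the recursive strategy, if the true values become available as you go (as they do inside a test fold), is to forecast one step at a time and add each observed value to the history before the next step. The sketch below only shows that loop structure: walk_forward_forecast and ar2_one_step are hypothetical names, and the one-step forecaster is a stand-in with made-up coefficients rather than a fitted statsmodels model.

```python
def walk_forward_forecast(train, test, forecast_one_step):
    """Produce one-step-ahead forecasts over the test fold.

    After each prediction the observed value (not the forecast)
    is appended to the history used for the next prediction.
    """
    history = list(train)
    preds = []
    for actual in test:
        preds.append(forecast_one_step(history))
        history.append(actual)
    return preds


def ar2_one_step(history, const=105.6, a1=0.1, a2=0.02):
    """Stand-in forecaster: an AR(2) rule with hypothetical coefficients."""
    return const + a1 * history[-1] + a2 * history[-2]


# Made-up data, for illustration only.
train = [118.0, 123.0, 119.0, 125.0]
test = [116.0, 127.0, 121.0, 113.0]

preds = walk_forward_forecast(train, test, ar2_one_step)
print(preds)
```

In practice the stand-in would be replaced by the one-step forecast of an ARIMA model that has been refit or updated with the latest observation. Because every prediction starts from real data, the forecasts keep reacting to the series instead of collapsing to the long-run mean.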
