
Forecasting multivariate data with auto.arima

I am trying to forecast weekly sales. The data cover 104 weeks and consist of these variables: week number, sales, average price per unit, holiday (whether the week contains a holiday), and promotion (whether a promotion is running). The last 6 observations of the data set look like this:

Week     Sales   Avg.price.unit   Holiday   Promotion
 101     8,970               50         0           1
 102    17,000               50         1           1
 103    23,000               80         1           0
 104    28,000              180         1           0
 105                        176         1           0
 106                         75         0           1

Now I want to forecast sales for the 105th and 106th weeks. I created a univariate time series x with the ts function and then ran auto.arima with the following command:

> x <- ts(sales$Sales, frequency=7)
> fit <- auto.arima(x, xreg=external, test=c("kpss","adf","pp"), seasonal.test=c("ocsb","ch"), allowdrift=TRUE)
> fit
ARIMA(1,1,1)

Coefficients:
          ar1      ma1  Avg.price.unit   Holiday  Promotion
      -0.1497  -0.9180          0.0363  -10.4181    -4.8971
s.e.   0.1012   0.0338          0.0646    5.1999     5.5148

sigma^2 estimated as 479.3:  log likelihood=-465.09
AIC=942.17   AICc=943.05   BIC=957.98

Now, to forecast the values for the last 2 weeks (105th and 106th), I supply the values of the external regressors for those weeks:

forecast(fit, xreg=ext)

where ext contains the future values of the regressors for the last 2 weeks.
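For reference, the workflow described above can be sketched end to end as follows. This is only a minimal sketch under stated assumptions: the data frame is simulated because the full data set was not posted, `frequency = 52` and `seasonal = FALSE` are choices made for this illustration rather than the settings used in the question, and the numeric check on Sales is an added sanity check. The column names and the two future rows of `ext` are taken from the table in the question.

```r
library(forecast)

set.seed(1)
n <- 104

# Simulated stand-in for the asker's data; every value here is made up.
sales <- data.frame(
  Sales          = round(15000 + cumsum(rnorm(n, 0, 500))),
  Avg.price.unit = round(runif(n, 50, 180)),
  Holiday        = rbinom(n, 1, 0.2),
  Promotion      = rbinom(n, 1, 0.3)
)

# Sanity check added for this sketch: the response must be numeric.
stopifnot(is.numeric(sales$Sales))

# Assumption: weekly data declared with 52 periods per year.
x <- ts(sales$Sales, frequency = 52)

# Regressors for the 104 training weeks, as a numeric matrix.
external <- as.matrix(sales[, c("Avg.price.unit", "Holiday", "Promotion")])

# Assumption: non-seasonal search only, to keep the sketch quick.
fit <- auto.arima(x, xreg = external, seasonal = FALSE)

# Regressor values for weeks 105 and 106, copied from the question's table.
# Column names and order match the training matrix.
ext <- cbind(
  Avg.price.unit = c(176, 75),
  Holiday        = c(1, 0),
  Promotion      = c(0, 1)
)

# The forecast horizon is taken from the number of rows in ext.
fc <- forecast(fit, xreg = ext)
print(fc)
```

With data on the scale of thousands, the point forecasts from a sketch like this should also be on the scale of thousands, which gives a quick way to compare against the output reported in the question.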

The output is:

         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
15.85714       44.13430 16.07853 72.19008 1.226693 87.04191
16.00000       45.50166 17.38155 73.62177 2.495667 88.50765

The output looks incorrect: the forecast sales values are far too small, given that the sales values in the training data are generally in the thousands.

If anyone can tell me why the forecast comes out incorrect or unexpected, that would be great.

If you know a priori that certain weeks of the year or certain events in the year may be important, you could form a Transfer Function that could be useful. You might have to include some ARIMA structure to deal with short-term autoregressive structure and/or some Pulses, Level Shifts, or Local Trends to deal with unspecified deterministic series (omitted variables). If you would like to post all of your data, I would be glad to demonstrate that for you, thus providing ground-zero help. Alternatively, you can email it to me at dave@autobox.com and I will analyze it and post the data and the results to the list. Other commentators on this question might also want to do the same for comparative analytics.
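As a rough illustration of the pulse and level-shift idea in this answer, the sketch below builds two indicator regressors and fits them alongside an AR(1) term using base R's `arima`. It is not the Autobox procedure: the intervention weeks (60 and 80), the effect sizes, and the simulated series are all hypothetical, and in practice the timing of such interventions would have to be known or detected.

```r
set.seed(3)
n <- 104
week <- seq_len(n)

# Hypothetical interventions; the week numbers are chosen only for illustration.
pulse       <- as.numeric(week == 60)   # one-off spike in a single week
level_shift <- as.numeric(week >= 80)   # permanent change from week 80 onward

# Simulated series (made-up values): both effects plus AR(1) noise.
noise <- as.numeric(arima.sim(list(ar = 0.5), n = n, sd = 200))
y <- 15000 + 3000 * pulse + 2000 * level_shift + noise

# AR(1) errors with the two deterministic regressors.
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(pulse, level_shift))
print(coef(fit))
```

The estimated coefficients on `pulse` and `level_shift` can then be compared with the simulated effect sizes to see how such regressors absorb deterministic changes that the ARIMA terms alone would not capture.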

Where are the 51 weekly dummies in your model? Without them you have no way to capture seasonality.
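One way to construct such weekly dummies is with `seasonaldummy` from the forecast package, sketched below. The series is simulated, and `frequency = 52` is an assumption made here so that 51 dummy columns result; the question itself used a different frequency.

```r
library(forecast)

set.seed(2)

# Two years of simulated weekly data (made-up values), 52 periods per year.
x <- ts(15000 + cumsum(rnorm(104, 0, 500)), frequency = 52)

# In-sample dummies: one row per observation, frequency - 1 columns.
dummies <- seasonaldummy(x)

# Dummies for the next 2 weeks, for use when forecasting.
future_dummies <- seasonaldummy(x, h = 2)

print(dim(dummies))
print(dim(future_dummies))
```

These matrices could be combined with other regressors using `cbind` before being passed as `xreg`. Note that 51 dummy columns estimated from 104 observations leave few degrees of freedom for anything else.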
