Do we need to difference exogenous variables before passing them to the exog argument of SARIMAX() from statsmodels in Python?
I am trying to build a forecasting model using SARIMAX in Python (regression with SARIMA errors) and need some guidance on how exogenous variables passed through the exog argument are handled.
The default parameters are:
SARIMAX(endog, exog=None, order=(1, 0, 0), seasonal_order=(0, 0, 0, 0), trend=None, measurement_error=False,
time_varying_regression=False, mle_regression=True, simple_differencing=False, enforce_stationarity=True,
enforce_invertibility=True, hamilton_representation=False, concentrate_scale=False, trend_offset=1,
use_exact_diffuse=False, dates=None, freq=None, missing='none', validate_specification=True, **kwargs)
This is how I fitted my model:
Before passing endog and exog to the SARIMAX function, I did not transform the variables.
SARIMAX(endog, exog=exog['TMIN_IAC'], order=(0, 1, 1), seasonal_order=(0, 0, 0, 0), trend='c')
And this is the resulting summary:
                               SARIMAX Results
==============================================================================
Dep. Variable:                    all   No. Observations:                  151
Model:               SARIMAX(0, 1, 1)   Log Likelihood                -624.229
Date:                Mon, 05 Apr 2021   AIC                           1256.457
Time:                        14:36:48   BIC                           1268.500
Sample:                    01-31-2001   HQIC                          1261.350
                         - 07-31-2013
Covariance Type:                  opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.2139      0.071      2.996      0.003       0.074       0.354
TMIN_IAC      -6.1222      0.474    -12.920      0.000      -7.051      -5.193
ma.L1         -0.9504      0.029    -33.060      0.000      -1.007      -0.894
sigma2       237.3801     33.036      7.185      0.000     172.631     302.130
===================================================================================
Ljung-Box (L1) (Q):                   0.25   Jarque-Bera (JB):                 2.21
Prob(Q):                              0.62   Prob(JB):                         0.33
Heteroskedasticity (H):               1.26   Skew:                            -0.08
Prob(H) (two-sided):                  0.42   Kurtosis:                         2.43
===================================================================================
I searched the documentation, but the closest thing to my question that it mentions is this:
If simple_differencing = True is used, then the endog and exog data are differenced prior to putting the model in state-space form. This has the same effect as if the user differenced the data prior to constructing the model, which has implications for using the results
My concern is that, according to Alan Pankratz in his book Forecasting With Dynamic Regression Models (1991), if differencing is applied to the errors in a multiple regression, both the dependent and the explanatory variables should be differenced, and I am not certain that statsmodels does that automatically.
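The algebra behind that statement can be sketched as follows. This is only an illustration, assuming a single regressor, an ARIMA(0,1,1) error as in the model above, and no constant term; the symbols \beta, \eta_t, \theta, \varepsilon_t and the backshift operator B are introduced here and are not taken from the book.

```latex
\begin{aligned}
y_t &= \beta x_t + \eta_t, \\
(1 - B)\,\eta_t &= (1 + \theta B)\,\varepsilon_t, \\
\Rightarrow\quad (1 - B)\,y_t &= \beta\,(1 - B)\,x_t + (1 + \theta B)\,\varepsilon_t .
\end{aligned}
```

Applying (1 - B) to the first equation differences y_t and x_t together, so the same \beta links the differenced series, and the remaining error is a stationary MA(1).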
It seems that SARIMAX from statsmodels also differences both the response and the exog variables automatically.
According to Rob Hyndman, author of the Arima function in the forecast package in R:
Arima will difference both the response variable and the xreg variables as specified in the order and seasonal arguments. You should never need to do the differencing yourself.
So I ran the same model in R and obtained the same results:
Arima(endog, order = c(0,1,1),seasonal = c(0,0,0), xreg = exog, include.drift = TRUE,
lambda = NULL, method = 'ML')
Model summary:
Regression with ARIMA(0,1,1) errors

Coefficients:
          ma1   drift  TMIN_IAC
      -0.9504  0.2139   -6.1219
s.e.   0.0381  0.0724    0.4763

sigma^2 estimated as 242.2:  log likelihood=-624.23
AIC=1256.47   AICc=1256.74   BIC=1268.51