I am trying to fit an ARIMAX model on a training sample (endogenous and exogenous variables) and then forecast using the exogenous variables, which are available for the forecast period. I am using the statsmodels module in Python.
I have the following code:
from statsmodels.tsa.arima.model import ARIMA
# Split the dataset: 100 training rows, 5 test rows
df_train = df.iloc[:100]
df_test = df.iloc[100:105]
# Define the model
model = ARIMA(endog=df_train['y'], exog=df_train[['x1', 'x2']], order=(2, 0, 2))
# Fit the model
results = model.fit()
# Predict for the next 5 periods
results.predict(steps=5, exog=df_test[['pc1', 'pc2']])
Unfortunately, it seems to return the in-sample fit on the train dataset rather than predictions for the test dataset, because there are 100 prediction points.
If there are 2 lags in the model, should I append the last 2 points of y from the train dataset, or does results somehow preserve information about the last values of y?
I have already found similar questions, but they were all about R.
Use the forecast method:
results.forecast(steps=5, exog=df_test[['pc1', 'pc2']])