I have this series, hospitalization_diff, with the following .head():
date
2020-10-16 347.0
2020-10-15 149.0
2020-10-14 530.0
2020-10-13 -489.0
2020-10-12 -859.0
Name: hospitalizedIncrease, dtype: float64
I want to forecast the time series using an ARIMA model (I have already tested for stationarity, differenced the series, and optimized the parameters). I got the code snippet from here.
# split into train-test set
size = int(len(hospitalization_diff) * 0.75)  # first 75% of rows go to train
train, test = hospitalization_diff[:size], hospitalization_diff[size:]
# Build Model
model = ARIMA(train, order=(0, 0, 1))
fitted = model.fit(disp=-1)
# Forecast
fc, se, conf = fitted.forecast(len(test), alpha=0.05) # 95% conf
# Make as pandas series
fc_series = pd.Series(fc, index=test.index)
lower_series = pd.Series(conf[:, 0], index=test.index)
upper_series = pd.Series(conf[:, 1], index=test.index)
# Plot
plt.figure(figsize=(12,5), dpi=100)
plt.plot(train, label='training')
plt.plot(test, label='actual')
plt.plot(fc_series, label='forecast')
plt.fill_between(lower_series.index, lower_series, upper_series,
                 color='k', alpha=.15)
plt.title('Forecast vs Actuals')
plt.legend(loc='upper left', fontsize=8)
plt.show()
In the resulting plot, however, the forecast covers the start of the series instead of the end.
I don't understand why it's predicting the start of the series. What am I doing wrong?
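This happens because your series is in reverse chronological order (newest date first), so both train and test are reversed too: the first 75% of rows, which you use as train, are actually the most recent dates, and test holds the oldest ones. Sort the index in ascending order at the very beginning, before splitting.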
# apply at the beginning of your code
hospitalization_diff.sort_index(inplace=True)
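To see the effect of the sort on the split, here is a minimal sketch using only pandas. The series below is a toy stand-in for hospitalization_diff: the first five values come from the .head() in the question, while the last three values and the dates before 2020-10-12 are made up for illustration. It uses .iloc for the positional split, which should behave the same as the plain slices in the question.

```python
import pandas as pd

# Toy stand-in for hospitalization_diff, newest date first as in the question.
# The last three values are invented for illustration only.
dates = pd.date_range("2020-10-09", "2020-10-16", freq="D")[::-1]
values = [347.0, 149.0, 530.0, -489.0, -859.0, 120.0, -75.0, 310.0]
hospitalization_diff = pd.Series(values, index=dates, name="hospitalizedIncrease")
hospitalization_diff.index.name = "date"

# A descending index means a positional split puts the newest rows in train.
was_sorted = hospitalization_diff.index.is_monotonic_increasing
print("sorted before fix:", was_sorted)

# Sort once, up front, so that row position matches time order.
hospitalization_diff = hospitalization_diff.sort_index()

# Same 75/25 positional split as in the question.
size = int(len(hospitalization_diff) * 0.75)
train = hospitalization_diff.iloc[:size]
test = hospitalization_diff.iloc[size:]

# After sorting, every training date precedes every test date.
print("train ends:", train.index.max().date())
print("test starts:", test.index.min().date())
```

With the index ascending, the test set is the tail of the series, so the forecast indexed by test.index should be drawn at the end of the plot rather than at the start.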