How to properly set start/end params of statsmodels.predict function

I am fitting a model and plotting a forecast:

# Import the ARIMA class from statsmodels and matplotlib for plotting
import matplotlib.pyplot as plt
from statsmodels.tsa.arima_model import ARIMA

# Fit an ARIMA(1,1,1) model; `data` is the price series with a DatetimeIndex
mod = ARIMA(data, order=(1,1,1))
res = mod.fit()

# Plot the original series and the forecasted series
res.plot_predict(start='2014-07-02', end='2018-09-28')
plt.show()

I got an error:

KeyError: "invalid literal for int() with base 10: '2014-07-02'"

I then read the statsmodels documentation: https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima_model.ARIMAResults.plot_predict.html
The obvious next step was to check the type of the index that contains '2014-07-02': it is pandas.core.indexes.datetimes.DatetimeIndex.
According to the documentation, datetime values should be accepted, which is why I am confused.

Following Martijn Pieters's comment, the real issue here is the index: it does not have every calendar date as a key, because the data is an Australian stock index:

            All Ordinaries closing price
Date    
2014-06-30  5382.0
2014-07-01  5366.5
2014-07-02  5441.7
2014-07-03  5479.5
2014-07-04  5511.8
2014-07-07  5506.3
2014-07-08  5498.5
2014-07-09  5442.2
2014-07-10  5454.3
2014-07-11  5474.6

So consecutive dates are sometimes one day apart and sometimes three days apart. However, I still don't understand why I cannot use res.plot_predict directly. Others may run into the same problem, because with a continuous time series it works.
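To make the gaps in the index concrete, here is a minimal sketch that uses only pandas and the ten rows shown above. The variable names are illustrative, and it only inspects the index; it does not try to reproduce the statsmodels error.

```python
import pandas as pd

# The ten rows from the table above
dates = pd.to_datetime([
    "2014-06-30", "2014-07-01", "2014-07-02", "2014-07-03", "2014-07-04",
    "2014-07-07", "2014-07-08", "2014-07-09", "2014-07-10", "2014-07-11",
])
prices = pd.Series(
    [5382.0, 5366.5, 5441.7, 5479.5, 5511.8,
     5506.3, 5498.5, 5442.2, 5454.3, 5474.6],
    index=dates,
    name="All Ordinaries closing price",
)

# Spacing, in days, between consecutive observations
gaps = prices.index.to_series().diff().dropna().dt.days
gap_counts = gaps.value_counts().to_dict()

# A weekend date is simply not a key in this index
has_weekend_key = pd.Timestamp("2014-07-05") in prices.index

print(gap_counts)
print(has_weekend_key)
```

Eight of the gaps are one day and one is three days (the weekend), so any start or end value that falls on a non-trading day has no matching key in the index.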

Kriss posted a link in the comments. I read it thoroughly, but I could not use it to solve my problem. Every date in my data is unique, but to make sure, I followed the answer:

data = data.groupby(pd.TimeGrouper(freq='D')).sum()


# Import the ARIMA module from statsmodels
from statsmodels.tsa.arima_model import ARIMA
from datetime import datetime


# Fit an ARIMA(1,1,1) model to the regrouped series
mod = ARIMA(data, order=(1,1,1))
res = mod.fit()

# Plot the original series and the forecasted series
res.plot_predict(start=min(data.index), end=datetime(2018,9,28))
plt.show()

Then, feeling like banging my head against the wall again, I got this error:

KeyError: Timestamp('2014-06-30 00:00:00')

The problem can be solved by using:

# Plot the original series and the forecasted series
res.plot_predict(start=datetime(2014,7,1), end=datetime(2018,9,28))
plt.show()

I can't use the first date as the start because the model uses first differencing (d=1), so the first observation has no differenced value.
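The effect of first differencing on the first date can be sketched with pandas alone. This uses Series.diff as a stand-in for the differencing an ARIMA model with d=1 performs, which is an assumption made for illustration; the variable names are hypothetical.

```python
import pandas as pd

# First three rows of the series shown in the question
prices = pd.Series(
    [5382.0, 5366.5, 5441.7],
    index=pd.to_datetime(["2014-06-30", "2014-07-01", "2014-07-02"]),
)

# First difference: each value minus the previous one.
# The first row has no previous value, so it becomes NaN.
differenced = prices.diff()
usable = differenced.dropna()

# Earliest date that still has a differenced value
first_usable_date = usable.index[0]

print(differenced)
print(first_usable_date)
```

The earliest date with a differenced value is 2014-07-01, which matches the start date that worked above.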

You are trying to convert a hyphen (-) to an integer, which is an impossible task for int().
