statsmodels ARMA to predict out-of-sample

I want to predict the return of a time series. I first fitted the model to the data set, but it fails when I try to predict tomorrow's return. My code is:

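    import datetime
    from datetime import timedelta

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima_model import ARMA  # location in older statsmodels releases

    date = datetime.datetime(2014,12,31)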
    todayDate = (date).strftime('%Y-%m-%d')
    startdate = (date - timedelta(days = 1)).strftime('%Y-%m-%d') 
    enddate = (date + timedelta(days = 2)).strftime('%Y-%m-%d')         
    data = get_pricing([symbol],start_date= date1, end_date = todayDate, frequency='daily')
    df =  pd.DataFrame({"value": data.price.values.ravel()},index = data.major_axis.ravel())
    result = df.pct_change().dropna() 

    degree = {}
    for x in range(0,5):
        for y in range(0,5):
            try:
                arma = ARMA(result, (x,y)).fit()
                degree[str(x) +str(y)] = arma.aic

            except:
                continue

    dic= sorted(degree.iteritems(), key = lambda d:d[1])

    p = int(dic[0][0][0])
    q = int(dic[0][0][1])
    arma = ARMA(result, (p,q)).fit()
    predicts = arma.predict()
    exogx = np.array(range(1,4))
    predictofs = arma.predict(startdate,enddate, exogx)

The last line fails with this error:

ValueError: Must provide freq argument if no data is supplied

I don't understand it. Has anyone encountered the same issue?

I had the same issue. It happens because your index has no frequency set. If you print data.index you will see something like:

DatetimeIndex(['2015-06-27', '2015-06-29', '2015-06-30', '2015-07-01', '2015-07-02', '2015-07-03', '2015-07-04', '2015-07-06', '2015-07-07', '2015-07-08', '2015-07-09', '2015-07-10', '2015-07-11', '2015-07-13', '2015-07-14', '2015-07-15', '2015-07-16', '2015-07-17', '2015-07-18', '2015-07-20', '2015-07-21', '2015-07-22', '2015-07-23', '2015-07-24', '2015-07-25', '2015-07-27', '2015-07-28', '2015-07-29', '2015-07-30', '2015-07-31'], dtype='datetime64[ns]', name=u'Date', freq=None)

Note the freq=None at the end.

You can do something like:

    data = pd.Series(data.values, data.index)
    data = data.asfreq('D')

You can also set the frequency on the index directly, which only makes sense when the dates really are evenly spaced at that frequency:

    data.index.freq = 'D'
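As a rough illustration of the asfreq approach, here is a minimal pandas-only sketch. The dates and return values are made up for the example, and filling the gap with 0.0 is just one assumption about how to treat a missing day; it is not something the original code does.

```python
import pandas as pd

# Hypothetical daily returns with one calendar day missing (2015-06-28).
dates = pd.to_datetime(["2015-06-27", "2015-06-29", "2015-06-30"])
returns = pd.Series([0.01, -0.02, 0.005], index=dates)

# An index built from irregular dates carries no frequency.
print(returns.index.freq)

# asfreq('D') reindexes onto a regular daily grid; missing days become NaN.
regular = returns.asfreq("D")
print(regular.index.freq)
print(regular.isna().sum())

# Assumption for this sketch: treat a missing day as a zero return.
filled = regular.fillna(0.0)
```

Note that asfreq adds rows for the missing days, so the NaN values have to be handled before the series is passed to a model.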

Let me know if that helps a little.


If that does not work, you can simply use integer positions to do the prediction and then fill in the index manually.
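The "fill in the index manually" step might look like the sketch below. It assumes the model was fitted on a plain array and that predict was called with integer start and end positions, so the forecast comes back without dates. The helper name attach_dates and the forecast numbers are hypothetical placeholders, not output from a real model.

```python
import pandas as pd

def attach_dates(forecast_values, last_observed, freq="D"):
    """Label integer-positioned forecasts with the dates that follow last_observed."""
    n = len(forecast_values)
    # Build n + 1 periods starting at the last observed date, then drop the
    # first one so the labels begin one step after the sample ends.
    index = pd.date_range(start=last_observed, periods=n + 1, freq=freq)[1:]
    return pd.Series(list(forecast_values), index=index)

# Placeholder numbers standing in for the model's out-of-sample predictions.
placeholder_forecast = [0.001, -0.002, 0.0005]
labelled = attach_dates(placeholder_forecast, "2014-12-31")
print(labelled)
```

This keeps the model itself free of any date handling, at the cost of having to keep track of which calendar date each integer position corresponds to.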
