
R's auto.arima() equivalent in Python

I want to implement an equivalent of R's auto.arima() function in Python.

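In R, auto.arima needs only the time series values as input: it computes the ARIMA order parameters (p, d, q) and fits the model, so the user does not have to supply p, d and q.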

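I want to use an equivalent of auto.arima in Python (without calling R's auto.arima) to forecast future values of a time series. For the series below, the Python equivalent should be run on 40 points to forecast the next 6 values; the window is then moved forward by 1 point and the same procedure is repeated.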

Here is the sample data:

value
0
2.584751
2.884758
2.646735
2.882105
3.267503
3.94552
4.70788
5.384803
54.77972
62.87139
78.68957
112.7166
155.0074
170.8084
196.1941
237.4928
254.9718
175.0717
217.3807
244.7357
274.4517
304.6838
373.3202
345.6252
461.2653
443.5982
472.3653
469.3326
506.8819
532.1639
542.2837
514.9269
528.0194
540.539
542.7031
556.8262
569.7132
576.2339
577.7212
577.0873
569.6199
573.2445
573.7825
589.3506
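
The rolling procedure described above (fit on 40 points, forecast the next 6, shift the window by 1) can be sketched independently of the model. This is a minimal sketch rather than the asker's code: `rolling_forecasts` and `naive_forecast` are hypothetical names, and the last-value forecaster is only a placeholder for whatever automatic ARIMA fit is eventually plugged in.

```python
def naive_forecast(train, horizon):
    """Placeholder model: repeat the last observed value (NOT an ARIMA fit)."""
    return [train[-1]] * horizon


def rolling_forecasts(series, window=40, horizon=6, step=1,
                      forecaster=naive_forecast):
    """Yield (start, forecasts) for every full training window.

    `forecaster` is any callable taking (train, horizon) and returning a
    list of `horizon` predicted values.
    """
    for start in range(0, len(series) - window + 1, step):
        train = series[start:start + window]
        yield start, forecaster(train, horizon)


# Small demonstration on a synthetic series (window of 4, horizon of 2).
demo = [float(t) for t in range(10)]
demo_results = list(rolling_forecasts(demo, window=4, horizon=2))
for start, forecasts in demo_results:
    print(start, forecasts)
```

Keeping the forecaster as a parameter means the windowing logic can be checked on its own before a real model is substituted for the placeholder.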

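I tried writing a function that uses the augmented Dickey-Fuller (ADF) test to compute the order of differencing. The differenced series (which becomes stationary once the original series has been differenced according to the adfuller test results) is then passed to an ARMA order selection function to compute the p and q order values.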

These values are then passed to the ARIMA function in statsmodels, but it does not seem to work.
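
For the differencing step, one hedged sketch is to difference the series cumulatively until a stationarity check passes, with a cap on d. `difference_order` and `adf_is_stationary` are hypothetical helper names. The ADF helper assumes statsmodels is installed and decides on the p-value at 0.05, which is an assumption of this sketch; the code in the question compares the test statistic with the 5% critical value instead.

```python
def difference(series):
    """First difference of a sequence: x[t] - x[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def adf_is_stationary(series, alpha=0.05):
    """Stationarity check via the ADF test (assumes statsmodels is installed)."""
    from statsmodels.tsa.stattools import adfuller  # imported lazily
    p_value = adfuller(series, autolag='AIC')[1]
    return p_value < alpha


def difference_order(series, is_stationary=adf_is_stationary, max_d=2):
    """Return (d, differenced_series), differencing at most `max_d` times."""
    current = list(series)
    for d in range(max_d + 1):
        if is_stationary(current):
            return d, current
        if d < max_d:
            # Difference the already-differenced series, not the original.
            current = difference(current)
    return max_d, current


# Demonstration with a stand-in check ("series is constant") so the
# example runs without statsmodels; it is not a statistical test.
def is_constant(series):
    return max(series) - min(series) < 1e-9


quadratic = [float(t * t) for t in range(10)]
d_demo, diffed_demo = difference_order(quadratic, is_stationary=is_constant)
print(d_demo, diffed_demo)
```

Capping d guards against a loop that never terminates when the test keeps failing, and returning the differenced series alongside d avoids recomputing it later.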

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import acf, pacf
import matplotlib.pyplot as plt

def diff_terms(timeseries):
    i=1
    j=0
    while i != 0:
        dftest = adfuller(timeseries, autolag='AIC')
        if dftest[0] <= dftest[4]["5%"]:
            i = 0
        else:
            timeseries = np.diff(timeseries)
            i = 1
            j = j + 1
    return j

def p_q_values_estimator(timeseries):
    p=0
    q=0
    lag_acf = acf(timeseries, nlags=20)
    lag_pacf = pacf(timeseries, nlags=20, method='ols')
    y=1.96/np.sqrt(len(timeseries))

    if lag_acf[0] < y:
        for a in lag_acf:
            if a < y:
                q = q + 1
                break 
    elif lag_acf[0] > y:
        for c in lag_acf:
            if c > y:
                q = q + 1
                break

    if lag_pacf[0] < y:
        for b in lag_pacf:
            if b < y:
                p = p + 1
                break
    elif lag_pacf[0] > y:
        for d in lag_pacf:
            if d > y:
                p = p + 1
                break

    p_q=[p,q]
    return(p_q)

def p_q_values_estimator2(timeseries):
    res = sm.tsa.arma_order_select_ic(timeseries, ic=['aic'], max_ar=5, max_ma=4,trend='nc')
    return res.aic_min_order

data1=[]
data=pd.read_csv('ABC.csv')
d_value=diff_terms(data.value)
data1[:]=data[:]
data = data[0:40]

i=0
while i < d_value:
    data_diff = np.diff(data)
    i = i+1

p_q_values=p_q_values_estimator(data)
p_value=p_q_values[0]
q_value=p_q_values[1]

p_q_values2=p_q_values_estimator2(data_diff)
p_value2=p_q_values2[0]
q_value2=p_q_values2[1]


exogx = np.array(range(0,40))
fit2 = sm.tsa.ARIMA(np.array(data), (p_value, d_value, q_value), exog = exogx).fit()
print(fit2.fittedvalues)
pred2 = fit2.predict(start = 40, end = 45, exog = np.array(range(40,46)))
print(pred2)
plt.plot(fit2.fittedvalues)
plt.plot(np.array(data))
plt.plot(range(40,46), np.array(pred2))
plt.show()

Error when using arma_order_select_ic:

p_q_values2=p_q_values_estimator2(data_diff)
line 56, in p_q_values_estimator2
res = sm.tsa.arma_order_select_ic(timeseries, ic=['aic'], max_ar=5, max_ma=4,trend='nc')
File "C:\Python27\lib\site-packages\statsmodels\tsa\stattools.py", line 1052, in arma_order_select_ic
min_res.update({i + '_min_order' : (mins[0][0], mins[1][0])})
IndexError: index 0 is out of bounds for axis 0 with size 0
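
The traceback fails while reading the position of the minimum, which suggests (as an assumption, not something the source confirms) that no candidate order produced a usable AIC. One way to make that case explicit is a small grid search that skips failed fits and raises a clear error when nothing could be fitted. `select_order` and `arima_aic` are hypothetical names; `arima_aic` assumes a statsmodels version that provides `statsmodels.tsa.arima.model.ARIMA`. This is an exhaustive AIC grid, not the stepwise search that R's auto.arima uses by default.

```python
import math


def arima_aic(series, order):
    """AIC of one ARIMA fit (assumes a recent statsmodels is installed)."""
    from statsmodels.tsa.arima.model import ARIMA  # imported lazily
    return ARIMA(series, order=order).fit().aic


def select_order(series, d, max_p=5, max_q=4, aic_of=arima_aic):
    """Return ((p, d, q), aic) with the lowest AIC over a (p, q) grid."""
    best_order, best_aic = None, math.inf
    for p in range(max_p + 1):
        for q in range(max_q + 1):
            try:
                aic = aic_of(series, (p, d, q))
            except Exception:
                continue  # this order could not be fitted; try the next one
            if math.isnan(aic):
                continue  # ignore fits that did not yield a usable AIC
            if aic < best_aic:
                best_order, best_aic = (p, d, q), aic
    if best_order is None:
        raise ValueError("no candidate ARIMA order could be fitted")
    return best_order, best_aic


# Demonstration with a made-up AIC surface so the example runs without
# statsmodels; every order with p == 0 is made to fail on purpose.
def fake_aic(series, order):
    p, _, q = order
    if p == 0:
        raise ValueError("pretend the optimizer failed")
    return float((p - 2) ** 2 + (q - 1) ** 2)


best_order, best_aic = select_order([0.0] * 10, d=1, max_p=3, max_q=2,
                                    aic_of=fake_aic)
print(best_order, best_aic)
```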

Error when using the ACF/PACF-based function to compute the p, q order:

fit2 = sm.tsa.ARIMA(np.array(data), (p_value, d_value, q_value), exog = exogx).fit()
File "C:\Python27\lib\site-packages\statsmodels\tsa\arima_model.py", line 1104, in fit
callback, **kwargs)
File "C:\Python27\lib\site-packages\statsmodels\tsa\arima_model.py", line 942, in fit
armafit.mle_retvals = mlefit.mle_retvals
AttributeError: 'LikelihoodModelResults' object has no attribute 'mle_retvals'

`vals` is an object of my own, but you can create your own index with pd.date_range.

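# Assumed setup (not part of the original snippet): rpy2 with R's forecast
# package installed, and numpy-to-R conversion enabled so the values array
# can be passed to ts(). traindf and vals are the answerer's own objects.
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
forecast = importr('forecast')  # R's auto.arima becomes forecast.auto_arima
ts = robjects.r('ts')

rdata = ts(traindf.requests_per_active.values, frequency=12)
# forecasts
fit = forecast.auto_arima(rdata)
forecast_output = forecast.forecast(fit, h=6, level=95.0)
# convert forecasts to dataframe
forecast_results = pd.Series(forecast_output[3], index=vals.index)
lowerpi = pd.Series(forecast_output[4], index=vals.index)
upperpi = pd.Series(forecast_output[5], index=vals.index)
results = pd.DataFrame({'forecast': forecast_results, 'lowerpi': lowerpi, 'upperpi': upperpi})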
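
As a hedged illustration of building the index with pd.date_range instead of reusing `vals.index`, the sketch below uses made-up numbers in place of real forecasts. The start date and the monthly frequency (`freq="MS"`, chosen to match `frequency=12`) are assumptions, not values from the original answer.

```python
import pandas as pd

# Hypothetical stand-ins for the 6 forecasts and their interval bounds.
forecast_mean = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
lower = [m - 0.2 for m in forecast_mean]
upper = [m + 0.2 for m in forecast_mean]

# Assumption: monthly data whose first forecast month is January 2020.
index = pd.date_range(start="2020-01-01", periods=len(forecast_mean), freq="MS")

results = pd.DataFrame(
    {"forecast": forecast_mean, "lowerpi": lower, "upperpi": upper},
    index=index,
)
print(results)
```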


