
Stationarity of time series data

I am trying to model a time series with ARIMA in Python. I ran statsmodels.tsa.stattools.arma_order_select_ic on the original series and got p = 2 and q = 2. The code is below:

import pandas as pd
import statsmodels.api as sm

# Daily index covering the 91 observations
dates = pd.date_range('2010-11-1', '2011-01-30')
dataseries = pd.Series([22,624,634,774,726,752,38,534,722,678,750,690,686,26,708,606,632,632,632,584,28,576,474,536,512,464,436,24,448,408,528,
          602,638,640,26,658,548,620,534,422,482,26,616,612,622,598,614,614,24,644,506,522,622,526,26,22,738,582,592,408,466,568,
          44,680,652,598,642,714,562,38,778,796,742,460,610,42,38,732,650,670,618,574,42,22,610,456,22,630,408,390,24], index=dates)
df = pd.DataFrame({'Consumption': dataseries})
df

# Compare AIC across ARMA(p, q) orders with p <= 4 and q <= 2
sm.tsa.arma_order_select_ic(df, max_ar=4, max_ma=2, ic='aic')

The result is as follows:

{'aic':              0            1            2
 0  1262.244974  1264.052640  1264.601342
 1  1264.098325  1261.705513  1265.604662
 2  1264.743786  1265.015529  1246.347400
 3  1265.427440  1266.378709  1266.430373
 4  1266.358895  1267.674168          NaN, 'aic_min_order': (2, 2)}

But when I run the augmented Dickey-Fuller test, the result shows that the series is not stationary.

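# ADF test with the statsmodels defaults (constant only, lag chosen by AIC)
d_order0 = sm.tsa.adfuller(dataseries)
print('adf: ', d_order0[0])
print('p-value: ', d_order0[1])
print('Critical values: ', d_order0[4])

# The unit-root null is rejected only if the statistic is below the critical value.
# Note: d is not defined in this snippet; presumably it was set earlier in the session.
if d_order0[0] > d_order0[4]['5%']:
    print('Time Series is  nonstationary')
    print(d)
else:
    print('Time Series is stationary')
    print(d)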

The output is as follows:

adf:  -1.96448506629
p-value:  0.302358888762
Critical values:  {'5%': -2.8970475206326833, '1%': -3.5117123057187376, '10%': -2.5857126912469153}
Time Series is  nonstationary
1

When I cross-checked the results in R, it showed that the original series is stationary. Why, then, does the augmented Dickey-Fuller test in Python report a non-stationary series?

Your data clearly has some seasonality: the autocorrelations in the R output below peak at lags 7 and 14, which is a weekly pattern in daily data. ARMA models and stationarity tests therefore need to be applied with care.
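As a rough check of the seasonality claim, the sample autocorrelation can be computed directly in Python. This is a minimal sketch using only the standard library; the helper name `autocorr` is made up for illustration, and the estimator (deviations from the overall mean, normalised by the lag-0 sum of squares) is assumed to be the one behind the R `acf` output quoted in this answer.

```python
# The 91 daily observations from the question
values = [22,624,634,774,726,752,38,534,722,678,750,690,686,26,708,606,632,632,632,584,28,576,474,536,512,464,436,24,448,408,528,
          602,638,640,26,658,548,620,534,422,482,26,616,612,622,598,614,614,24,644,506,522,622,526,26,22,738,582,592,408,466,568,
          44,680,652,598,642,714,562,38,778,796,742,460,610,42,38,732,650,670,618,574,42,22,610,456,22,630,408,390,24]

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag (hypothetical helper)."""
    n = len(x)
    mean = sum(x) / n
    # Lag-0 sum of squared deviations, used as the normaliser
    denom = sum((v - mean) ** 2 for v in x)
    # Cross-products of deviations that are `lag` steps apart
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom

# Autocorrelations for lags 1 to 14, keyed by lag
acf = {lag: autocorr(values, lag) for lag in range(1, 15)}
for lag, r in acf.items():
    print(f"lag {lag:2d}: {r:6.3f}")
```

If lags 7 and 14 stand out well above the others, that points to a weekly cycle, which a unit-root test with only a few lags does not account for.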

The difference between the ADF test results in Python and R appears to come from the default number of lags each one uses.
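The two default-lag rules quoted in this answer can be evaluated in Python as well. This sketch only restates the formulas given here. It does not call either library, and the variable names are illustrative.

```python
import math

nobs = 91  # number of observations in the series

# Rule attributed to Python (statsmodels) in this answer: 12 * (nobs / 100)^(1/4)
python_rule = 12 * (nobs / 100) ** (1 / 4)

# Rule attributed to R (adf.test) in this answer: trunc((nobs - 1)^(1/3))
r_rule = math.trunc((nobs - 1) ** (1 / 3))

print("Python rule:", round(python_rule, 5))
print("R rule:", r_rule)
```

The R session that follows computes the same two values and then reruns the test at several lag orders.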

> (nobs=length(dataseries))
[1] 91
> 12*(nobs/100)^(1/4)  #python default
[1] 11.72038
> trunc((nobs-1)^(1/3)) #R default
[1] 4
> acf(coredata(dataseries),plot = F)

Autocorrelations of series ‘coredata(dataseries)’, by lag

     0      1      2      3      4      5      6      7      8      9     10     11 
 1.000  0.039 -0.116 -0.124 -0.094 -0.148  0.083  0.645 -0.072 -0.135 -0.138 -0.146 
    12     13     14     15     16     17     18     19 
-0.185  0.066  0.502 -0.097 -0.151 -0.165 -0.195 -0.160 
> adf.test(dataseries,k=12)

    Augmented Dickey-Fuller Test

data:  dataseries
Dickey-Fuller = -2.6172, Lag order = 12, p-value = 0.322
alternative hypothesis: stationary

> adf.test(dataseries,k=4)

    Augmented Dickey-Fuller Test

data:  dataseries
Dickey-Fuller = -6.276, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary

Warning message:
In adf.test(dataseries, k = 4) : p-value smaller than printed p-value
> adf.test(dataseries,k=7)

    Augmented Dickey-Fuller Test

data:  dataseries
Dickey-Fuller = -2.2571, Lag order = 7, p-value = 0.4703
alternative hypothesis: stationary
