Why am I getting p-value 0.00000 in adfuller test?

I am working with ARIMA. To make the data stationary, I log-transformed it and then differenced it by subtracting the shifted values. When I test the result again with a rolling mean and the adfuller test, I get a p-value of 0.000000. Why is that?

My code:

import numpy as np 
import pandas as pd 
from statsmodels.tsa.stattools import adfuller
import matplotlib.pyplot as plt
# df holds the daily price data; its first rows look like this:
# Date        open         high         low        close       adjclose    Volume
# 2010-06-30  5.158000    6.084000    4.660000    4.766000    4.766000    85935500
# 2010-07-01  5.000000    5.184000    4.054000    4.392000    4.392000    41094000
# infer_datetime_format is deprecated in recent pandas versions, so it is omitted
df['Date']=pd.to_datetime(df['Date'])
df=df.set_index(['Date'])
def test_ad(values):
    mvm = values.rolling(window=12).mean()
    mvstd = values.rolling(window=12).std()
    orig = plt.plot(values,color='blue',label='org')
    mean = plt.plot(mvm,color='red',label='mvm')
    std=plt.plot(mvstd,color='black',label='mvstd')
    plt.legend(loc='best')
    plt.show(block=False)
    result=adfuller(values)
    print('ADF Statistic: %f' % result[0])
    print('p-value: %f' % result[1])
    print('Critical Values:')
    #labels = ['ADF Test Statistic','p-value','#Lags Used','Number of Observations Used']
    for key, value in result[4].items():
        print('\t%s: %.3f' % (key, value))
    if result[1] <= 0.05:
        print("Data is stationary")
    else:
        print("non-stationary ")

test_ad(df['Close'])

which gives:

ADF Statistic: 6.450459
p-value: 1.000000
Critical Values:
    1%: -3.433
    5%: -2.863
    10%: -2.567


df['log']=np.log(df["Close"])
df['close']=df['log']-df['log'].shift()
#df['close']=df['log'].diff()
test_ad(df['close'].dropna())

which gives:

ADF Statistic: -50.361617
p-value: 0.000000
Critical Values:
    1%: -3.433
    5%: -2.863
    10%: -2.567

The graph looks stationary, and the ADF statistic is below every critical value, as you can see above.

You can see for yourself that your ADF statistic is far below the 1% critical value, so your p-value is simply extremely small.
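As a quick check of that comparison, here is a minimal sketch that uses only the numbers printed in the question. The helper name `rejected_levels` is made up for illustration and is not part of statsmodels.

```python
# Values copied from the second output in the question.
adf_statistic = -50.361617
critical_values = {"1%": -3.433, "5%": -2.863, "10%": -2.567}


def rejected_levels(statistic, crit):
    """Return the levels at which the unit-root null hypothesis is rejected.

    The ADF test rejects when the statistic is more negative than the
    critical value for that level.
    """
    return [level for level, value in crit.items() if statistic < value]


levels = rejected_levels(adf_statistic, critical_values)
print(levels)
```

The statistic clears every level by a wide margin, which is why the p-value ends up so small. The first output in the question (statistic 6.450459) would clear none of them.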

What makes it confusing is that you are printing this value with %f, which by default (i.e. without specifying a precision, such as %.2f for 2 decimals or %.10f for 10 decimals) shows only 6 digits after the decimal point.

If you print the value in full, for example with print('p-value: %s' % result[1]), which formats the p-value as a string (so there is no need to specify a precision), or with an f-string, print(f'p-value: {result[1]}'), you would normally see that your p-value is actually above 0 (although still very small). Note that for a statistic as extreme as -50, statsmodels may return exactly 0.0, because its p-value approximation is clipped outside the range it covers; either way, the true p-value is vanishingly small.
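To illustrate the formatting difference, here is a small sketch. The value of `p_value` is hypothetical and chosen only to show the effect; it is not the p-value from the question.

```python
# Hypothetical tiny p-value, NOT taken from the question's data.
p_value = 3.2e-9

# %f defaults to 6 digits after the decimal point, so the value rounds to zero.
fixed = "p-value: %f" % p_value

# %s uses str(), which switches to scientific notation for small floats.
as_str = "p-value: %s" % p_value

# An explicit scientific format gives control over the number of digits.
sci = f"p-value: {p_value:.3e}"

print(fixed)
print(as_str)
print(sci)
```

Scientific notation (%e or an f-string with :.3e) is usually the more readable choice for p-values, since it keeps the magnitude visible however small the number is.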
