auto_arima crashing with single value change on series

I'm working on a time series forecast model with pmdarima.

My time series is short but reasonably well behaved. The following code raises an error in sklearn\utils\validation.py:

from pmdarima import auto_arima
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller
import datetime
import pandas as pd

datelist = pd.date_range('2018-01-01', periods=24, freq='MS')

sales = [26.000000,27.100000,26.000000,28.014286,28.057143,
         30.128571,39.800000,33.000000,37.971429,45.914286,
         37.942857,33.885714,36.285714,34.971429,40.042857,
         27.157143,30.685714,35.585714,43.400000,51.357143,
         45.628571,49.942857,42.028571,52.714286]


df = pd.DataFrame(data=sales,index=datelist,columns=['sales'])

observations = df['sales']
size = df['sales'].size
shape = df['sales'].shape
maxdate = max(df.index).strftime("%Y-%m-%d")
mindate = min(df.index).strftime("%Y-%m-%d")


asc = seasonal_decompose(df, model='add')

if asc.seasonal[asc.seasonal.notnull()].size == df['sales'].size:
    seasonality = True
else:
    seasonality = False

# Check Stationarity
aftest = adfuller(df['sales'])

if aftest[1] <= 0.05:
    stationarity = True
else:
    stationarity = False

results = auto_arima(observations,
                     seasonal=seasonality,
                     stationary=stationarity,
                     m=12,
                     error_action="ignore")

~\AppData\Roaming\Python\Python37\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    584                              " minimum of %d is required%s."
    585                              % (n_samples, array.shape, ensure_min_samples,
--> 586                                 context))
    587 
    588     if ensure_min_features > 0 and array.ndim == 2:

ValueError: Found array with 0 sample(s) (shape=(0,)) while a minimum of 1 is required.

However, if I change the first value of the sales series from 26 to 30, it works.

What could be wrong here?

  1. Your example is not reproducible: as posted, seasonality and stationarity are not defined in the global scope, so auto_arima throws an error of the form

    NameError: name 'seasonality' is not defined

  2. You have only a few observations, so try explicitly setting the min/max order values for the different ARIMA processes. IMO, this is generally good practice. In your case we can do

    fit = auto_arima(
        observations,
        start_p=0, start_q=0, start_P=0, start_Q=0,
        max_p=3, max_q=3, max_P=3, max_Q=3,
        D=1, max_D=2,
        m=12,
        seasonal=True,
        error_action='ignore')

    Here we consider processes up to MA(3) and AR(3), as well as SMA(3) and SAR(3).
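
One reason the seasonal differencing order matters on a series this short: every seasonal difference with m = 12 drops 12 observations. The following is a minimal, library-free sketch of that arithmetic on the data from the question. The helper `seasonal_difference` is a hypothetical name introduced here for illustration, not a pmdarima function, and whether this is what actually triggers the ValueError in the question is not established here.

```python
# Same 24 monthly observations as in the question.
sales = [26.000000, 27.100000, 26.000000, 28.014286, 28.057143,
         30.128571, 39.800000, 33.000000, 37.971429, 45.914286,
         37.942857, 33.885714, 36.285714, 34.971429, 40.042857,
         27.157143, 30.685714, 35.585714, 43.400000, 51.357143,
         45.628571, 49.942857, 42.028571, 52.714286]


def seasonal_difference(values, m):
    """Hypothetical helper: values[t] - values[t - m] for every t >= m."""
    return [values[t] - values[t - m] for t in range(m, len(values))]


once = seasonal_difference(sales, 12)   # corresponds to D = 1
twice = seasonal_difference(once, 12)   # corresponds to D = 2

# Number of observations left at each differencing order.
print(len(sales), len(once), len(twice))  # prints: 24 12 0
```

With two years of monthly data, D = 1 leaves only 12 points to fit on and D = 2 leaves none, so it seems reasonable to fix D rather than let it be estimated.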

  3. Let's visualise the original time series data including the forecast

    import matplotlib.pyplot as plt
    import matplotlib.dates as dates

    n_ahead = 10
    preds, conf_int = fit.predict(n_periods=n_ahead, return_conf_int=True)
    xrange = pd.date_range(min(datelist), periods=24 + n_ahead, freq='MS')

    fig = plt.figure()
    plt.plot(xrange[:df.shape[0]], df["sales"])
    plt.plot(xrange[df.shape[0]:], preds)
    plt.fill_between(
        xrange[df.shape[0]:], conf_int[:, 0], conf_int[:, 1],
        alpha=0.1, color='b')
    plt.show()

