LU decomposition error in statsmodels ARIMA model

I know there is a very similar question and answer on Stack Overflow (here), but this case seems distinctly different. I am using statsmodels v0.13.2, and I am using an ARIMA model rather than a SARIMAX model.

I am trying to fit an ARIMA model to each time series in a list of data sets. The offending piece of my code is here:

import math

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

items = np.log(og_items)
# Replace NaN and +/-inf (log(0) is -inf) with 0
items['count'] = items['count'].apply(lambda x: 0 if math.isnan(x) or math.isinf(x) else x)
model = ARIMA(items, order=(14, 0, 7))
trained = model.fit()

items is a dataframe with a date index and a single column, count.
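For reference, here is a minimal sketch of the log-then-clean step from the snippet above, run on made-up data. The small og_items frame is a hypothetical stand-in for the real one, and `where` with `np.isfinite` is just one way to do the replacement, assumed to be equivalent to the lambda.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for og_items: daily counts, some of them zero.
og_items = pd.DataFrame(
    {"count": [3.0, 0.0, 5.0, 0.0, 1.0]},
    index=pd.date_range("2022-01-01", periods=5, freq="D"),
)

# log(0) is -inf; silence the divide-by-zero warning for this step only.
with np.errstate(divide="ignore"):
    items = np.log(og_items)

# Keep finite values and replace NaN, +inf and -inf with 0.
items["count"] = items["count"].where(np.isfinite(items["count"]), 0.0)

# Confirm that nothing non-finite reaches the model.
all_finite = bool(np.isfinite(items["count"]).all())
```

Note that a count of 1 also maps to 0 after the log, so a 0 in the cleaned column does not always mean the original count was 0.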

I apply the lambda on the second line because some counts can be 0, which gives negative infinity after the log is applied. The final data going into the ARIMA model does not contain any NaNs or infinite values. However, when I try this without the log transform, I do not get the error. The error only occurs on certain series, and there does not seem to be any rhyme or reason to which ones are affected. One series had about half of its values equal to zero after applying the lambda, while another did not have a single zero. Here is the error:

Traceback (most recent call last):
  File "item_pipeline.py", line 267, in <module>
    main()
  File "item_pipeline.py", line 234, in main
    restaurant_predictions = make_predictions(restaurant_data=restaurant_data, models=models,
  File "item_pipeline.py", line 138, in make_predictions
    predictions = model(*data_tuple[:2], min_date=min_date, max_date=max_date,
  File "/Users/rob/Projects/5out-ml/models/item_level/items/predict_arima.py", line 127, in predict_daily_arima
    predict_date_arima(prediction_dict, item_dict, prediction_date, x_days_out=x_days_out, log_vals=log_vals,
  File "/Users/rob/Projects/5out-ml/models/item_level/items/predict_arima.py", line 51, in predict_date_arima
    raise e
  File "/Users/rob/Projects/5out-ml/models/item_level/items/predict_arima.py", line 47, in predict_date_arima
    fitted = model.fit()
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/arima/model.py", line 390, in fit
    res = super().fit(
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 704, in fit
    mlefit = super(MLEModel, self).fit(start_params, method=method,
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/model.py", line 563, in fit
    xopt, retvals, optim_settings = optimizer._fit(f, score, start_params,
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/optimizer.py", line 241, in _fit
    xopt, retvals = func(objective, gradient, start_params, fargs, kwargs,
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/optimizer.py", line 651, in _fit_lbfgs
    retvals = optimize.fmin_l_bfgs_b(func, start_params, maxiter=maxiter,
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_lbfgsb_py.py", line 199, in fmin_l_bfgs_b
    res = _minimize_lbfgsb(fun, x0, args=args, jac=jac, bounds=bounds,
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_lbfgsb_py.py", line 362, in _minimize_lbfgsb
    f, g = func_and_grad(x)
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 286, in fun_and_grad
    self._update_grad()
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 256, in _update_grad
    self._update_grad_impl()
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 173, in update_grad
    self.g = approx_derivative(fun_wrapped, self.x, f0=self.f,
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py", line 505, in approx_derivative
    return _dense_difference(fun_wrapped, x0, f0, h,
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py", line 576, in _dense_difference
    df = fun(x) - f0
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_numdiff.py", line 456, in fun_wrapped
    f = np.atleast_1d(fun(x, *args, **kwargs))
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 137, in fun_wrapped
    fx = fun(np.copy(x), *args)
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/base/model.py", line 531, in f
    return -self.loglike(params, *args) / nobs
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 939, in loglike
    loglike = self.ssm.loglike(complex_step=complex_step, **kwargs)
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/kalman_filter.py", line 983, in loglike
    kfilter = self._filter(**kwargs)
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/kalman_filter.py", line 903, in _filter
    self._initialize_state(prefix=prefix, complex_step=complex_step)
  File "/Users/rob/Projects/5out-ml/venv/lib/python3.8/site-packages/statsmodels/tsa/statespace/representation.py", line 983, in _initialize_state
    self._statespaces[prefix].initialize(self.initialization,
  File "statsmodels/tsa/statespace/_representation.pyx", line 1362, in statsmodels.tsa.statespace._representation.dStatespace.initialize
  File "statsmodels/tsa/statespace/_initialization.pyx", line 288, in statsmodels.tsa.statespace._initialization.dInitialization.initialize
  File "statsmodels/tsa/statespace/_initialization.pyx", line 406, in statsmodels.tsa.statespace._initialization.dInitialization.initialize_stationary_stationary_cov
  File "statsmodels/tsa/statespace/_tools.pyx", line 1206, in statsmodels.tsa.statespace._tools._dsolve_discrete_lyapunov
numpy.linalg.LinAlgError: LU decomposition error.

The solution in the other Stack Overflow post was to initialize the statespace differently. Judging by the last few lines of the traceback, the statespace initialization is involved here too. However, that workflow does not seem to be exposed in the newer version of statsmodels. Is it? If not, what else can I try to circumvent this error?

So far, I have tried manually initializing the model to approximate diffuse, and manually setting the initialize property to approximate diffuse. Neither seems to be valid in the new statsmodels code.

Turns out there's a new way to initialize. The second line below is the operative line.

model = ARIMA(items, order=(14, 0, 7))
model.initialize_approximate_diffuse() # this line
trained = model.fit()
