ARIMA out-of-sample prediction in statsmodels?

I have a time series forecasting problem that I am using the statsmodels Python package to address. Evaluated by AIC, the optimal model turns out to be quite complex, something like ARIMA(27,1,8) (I haven't done an exhaustive search of the parameter space, but there seems to be a minimum around there). I am having real trouble validating and forecasting with this model though, because it takes a very long time (hours) to train a single model instance, so doing repeated tests is very difficult.

In any case, the minimum I need in order to use statsmodels in operations (assuming I can get the model validated somehow first) is a mechanism for incorporating new data as it arrives, so that I can make the next set of forecasts. I would like to be able to fit a model on the available data, pickle it, and then unpickle it later when the next data point is available and incorporate that point into an updated set of forecasts. At the moment I have to re-fit the model each time new data becomes available, which, as I said, takes a very long time.

I had a look at this question, which addresses essentially the problem I have, but for ARMA models. For the ARIMA case, however, there is the added complexity of the data being differenced. I need to be able to produce new forecasts of the original time series (cf. the typ='levels' keyword in the ARIMAResultsWrapper.predict method). It's my understanding that statsmodels cannot do this at present, but what components of the existing functionality would I need to use in order to write something to do this myself?
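On the differencing point specifically: for d=1, forecasts of the differenced series can be turned back into levels by cumulatively summing them and adding the last observed level. A minimal sketch of that step only, assuming d=1; the function name `differences_to_levels` and the numbers are made up for illustration and are not part of statsmodels:

```python
from itertools import accumulate


def differences_to_levels(last_level, diff_forecasts):
    """Convert forecasts of the first-differenced series back to levels (d=1)."""
    # Each level forecast is the last observed level plus the running
    # total of the forecast changes up to that horizon.
    return [last_level + total for total in accumulate(diff_forecasts)]


observed = [10.0, 12.0, 11.0, 13.0]   # original series (made-up values)
diff_forecasts = [0.5, -0.2, 0.3]     # forecasts of y[t] - y[t-1] (made-up)
level_forecasts = differences_to_levels(observed[-1], diff_forecasts)
```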

Edit: I am also using transparams=True, so the prediction process needs to be able to transform the predictions back into the original time series, which is an additional difficulty in a homebrew approach.

An ARIMA(27,1,8) model is extremely complex in the scheme of things. For most time series, you can do reasonable prediction with five or so parameters. Of course it depends on the data and domain, but I'm very skeptical that 27 + 8 = 35 parameters are necessary.

The AIC is known to be too permissive with the number of parameters at times. I'd try comparing results with BIC.
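To make the difference concrete: AIC charges 2 per parameter, while BIC charges ln(n) per parameter, which is larger once there are more than about 7 observations. A small sketch using the standard definitions; the log-likelihoods and parameter counts below are hypothetical numbers chosen for illustration, not results from the question's data:

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits of a small and a large model to 1000 observations.
n_obs = 1000
small = {"llf": -1420.0, "k": 5}
large = {"llf": -1385.0, "k": 36}

aic_small, aic_large = aic(small["llf"], small["k"]), aic(large["llf"], large["k"])
bic_small = bic(small["llf"], small["k"], n_obs)
bic_large = bic(large["llf"], large["k"], n_obs)

# With these made-up numbers AIC favours the large model, BIC the small one.
prefers_large_by_aic = aic_large < aic_small
prefers_small_by_bic = bic_small < bic_large
```

Fitted statsmodels results expose both values as the `aic` and `bic` attributes, so the comparison does not need to be computed by hand.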

I'd also look into whether your data has seasonality of some kind. E.g., maybe not all 27 of those AR terms matter, and you really just need lag=1 and lag=24 (for instance). That might be the case for hourly data that has daily seasonality.
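As a rough illustration of the "only lag 1 and lag 24" idea, the sketch below fits an autoregression with just those two lags by ordinary least squares using NumPy. This is a stand-in for a proper maximum-likelihood ARIMA fit, not what statsmodels does internally; the helper `fit_sparse_ar` and the synthetic series are assumptions for the example, and for the d=1 case in the question it would be applied to the already-differenced series:

```python
import numpy as np


def fit_sparse_ar(series, lags):
    """Least-squares fit of y[t] = c + sum(phi_l * y[t - l] for l in lags)."""
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    target = series[max_lag:]
    # One column of ones for the intercept, then one column per chosen lag,
    # aligned so that row i of each column holds y[t - lag] for target y[t].
    columns = [np.ones_like(target)]
    columns += [series[max_lag - lag:len(series) - lag] for lag in lags]
    design = np.column_stack(columns)
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs  # [intercept, then one coefficient per lag, in order]


# Synthetic "hourly" series with dependence on the previous hour and on
# the same hour of the previous day (coefficients 0.3 and 0.5).
rng = np.random.default_rng(1)
n = 2000
y = np.zeros(n)
for t in range(24, n):
    y[t] = 0.3 * y[t - 1] + 0.5 * y[t - 24] + rng.normal()

intercept, phi_1, phi_24 = fit_sparse_ar(y, lags=[1, 24])
```

Two lag coefficients plus an intercept recover the structure here, where a dense AR model would need 24 or more terms to reach the same seasonal lag.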

