Fitting a Regression Model to Time Series Data

Question

I am trying to fit a regression model to time series data in Python (basically to predict the trend). I previously applied seasonal decomposition using statsmodels, which splits the data into its three components, including the trend. However, I would like to know how to find the best fit for my data using statistical regressions (by defining arbitrary functions), and then check the sum of squares to compare various models and select the one that fits my data best. I should mention that I am not looking for learning-based regressions that rely on training/testing data. I would appreciate it if anyone could help me with this or point me to a tutorial on the topic.

Answer 1

Since you mentioned:

I would like to know how to find the best fit for my data using statistical regressions (by defining arbitrary functions), and then check the sum of squares to compare various models and select the one that fits my data best. I should mention that I am not looking for learning-based regressions that rely on training/testing data.

You could try an ARIMA (AutoRegressive Integrated Moving Average) model with a given order (p, d, q), which is fitted on the history and can then predict() / forecast(). Please note that the data are split into train and test sets here only for the sake of evaluation, using walk-forward validation:

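from datetime import datetime
from math import sqrt

from matplotlib import pyplot
from pandas import read_csv
from sklearn.metrics import mean_squared_error
# Import path for statsmodels <= 0.12; from 0.13 on, ARIMA lives in
# statsmodels.tsa.arima.model instead.
from statsmodels.tsa.arima_model import ARIMA

# load dataset
def parser(x):
    # the file stores dates without the leading '190' of the year
    return datetime.strptime('190'+x, '%Y-%m')
# .squeeze('columns') turns the single-column DataFrame into a Series
series = read_csv('/content/shampoo.txt', header=0, index_col=0, parse_dates=True, date_parser=parser).squeeze('columns')
series.index = series.index.to_period('M')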
# split into train and test sets
X = series.values
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
# walk-forward validation
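for t in range(len(test)):
    # refit on everything observed so far: AR order 5, first differencing, no MA terms
    model = ARIMA(history, order=(5,1,0))
    model_fit = model.fit()
    # one-step-ahead forecast
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    # reveal the true observation before forecasting the next step
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))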
# evaluate forecasts
rmse = sqrt(mean_squared_error(test, predictions))
rmse_ = 'Test RMSE: %.3f' % rmse

# plot forecasts against actual outcomes
pyplot.plot(test, label='test')
pyplot.plot(predictions, color='red', label='predict')
pyplot.xlabel('Months')
pyplot.ylabel('Sale')
pyplot.title(f'ARIMA model performance with {rmse_}')
pyplot.legend()
pyplot.show()

I used the same library package you mentioned, with the following outputs, including a Root Mean Square Error (RMSE) evaluation:

import statsmodels as sm
sm.__version__ # '0.10.2'
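
The walk-forward loop above does not depend on ARIMA itself: any function that maps the history to a one-step forecast can be scored the same way. Here is a minimal sketch using only the standard library, assuming made-up data. The names walk_forward_rmse, naive_last and naive_drift are hypothetical helpers for illustration, not part of statsmodels.

```python
from math import sqrt

def walk_forward_rmse(values, forecast_fn, train_fraction=0.66):
    """One-step-ahead walk-forward RMSE.

    forecast_fn (hypothetical interface) takes the history as a list and
    returns a single forecast for the next time step.
    """
    size = int(len(values) * train_fraction)
    history = list(values[:size])
    test = list(values[size:])
    if not history or not test:
        raise ValueError("need at least one training and one test point")
    squared_errors = []
    for obs in test:
        yhat = forecast_fn(history)        # forecast uses past observations only
        squared_errors.append((obs - yhat) ** 2)
        history.append(obs)                # reveal the true value before the next step
    return sqrt(sum(squared_errors) / len(squared_errors))

def naive_last(history):
    """Persistence forecast: the next value equals the last observed value."""
    return history[-1]

def naive_drift(history):
    """Drift forecast: last value plus the average step seen so far."""
    return history[-1] + (history[-1] - history[0]) / (len(history) - 1)

# Made-up monthly values, for illustration only.
values = [12.0, 15.0, 14.0, 19.0, 22.0, 21.0, 27.0, 30.0, 29.0, 35.0, 38.0, 36.0]
for name, fn in [("last value", naive_last), ("drift", naive_drift)]:
    print(f"{name:10s} RMSE = {walk_forward_rmse(values, fn):.3f}")
```

Simple baselines like these give a reference RMSE; an ARIMA order is only worth keeping if it scores better than them on the same split.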

Please see the other posts, post1 and post2, for further information. You could also add a trend line.

Answer 1
0 2022-02-16 19:51:28