
Negative results in ARIMA model

I'm trying to predict daily revenue up to the end of the month by learning from the previous month. Because revenue behaves differently on workdays and weekends, I decided to use a time series model (ARIMA) in Python.

This is the Python code I'm using:

import itertools
import pandas as pd
import numpy as np
from datetime import datetime, date, timedelta
import statsmodels.api as sm
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import calendar

data_temp = [['01/03/2020',53921.785],['02/03/2020',97357.9595],['03/03/2020',95353.56893],['04/03/2020',93319.6761999999],['05/03/2020',88835.79958],['06/03/2020',98733.0856000001],['07/03/2020',61501.03036],['08/03/2020',74710.00968],['09/03/2020',156613.20712],['10/03/2020',131533.9006],['11/03/2020',108037.3002],['12/03/2020',106729.43067],['13/03/2020',125724.79704],['14/03/2020',79917.6726599999],['15/03/2020',90889.87192],['16/03/2020',160107.93834],['17/03/2020',144987.72243],['18/03/2020',146793.40641],['19/03/2020',145040.69416],['20/03/2020',140467.50472],['21/03/2020',69490.18814],['22/03/2020',82753.85331],['23/03/2020',142765.14863],['24/03/2020',121446.77825],['25/03/2020',107035.29359],['26/03/2020',98118.19468],['27/03/2020',82054.8721099999],['28/03/2020',61249.91097],['29/03/2020',72435.6711699999],['30/03/2020',127725.50818],['31/03/2020',77973.61724]] 
panel = pd.DataFrame(data_temp, columns = ['Date', 'revenue'])

pred_result=pd.DataFrame(columns=['revenue'])
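# The dates are day/month/year, so pass the format explicitly; otherwise '01/03/2020' can be read as January 3
panel['Date']=pd.to_datetime(panel['Date'], format='%d/%m/%Y')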
panel.set_index('Date', inplace=True)
ts = panel['revenue']

p = d = q = range(0, 2)
pdq = list(itertools.product(p, d, q))

seasonal_pdq = [(x[0], x[1], x[2], 7) for x in list(itertools.product(p, d, q))]
aic = float('inf')
for es in [True,False]:
    for param in pdq:
      for param_seasonal in seasonal_pdq:
        try:
          mod = sm.tsa.statespace.SARIMAX(ts,
                                          order=param,
                                          seasonal_order=param_seasonal,
                                          enforce_stationarity=es,
                                          enforce_invertibility=False)
          results = mod.fit()
          if results.aic<aic:
            param1=param
            param2=param_seasonal
            aic=results.aic
            es1=es
          #print('ARIMA{}x{} enforce_stationarity={} - AIC:{}'.format(param, param_seasonal,es,results.aic))
        except Exception:  # skip parameter combinations that fail to fit
          continue
print('Best model parameters: ARIMA{}x{} - AIC:{} enforce_stationarity={}'.format(param1, param2, aic,es1))

mod = sm.tsa.statespace.SARIMAX(ts,
                                order=param1,
                                seasonal_order=param2,
                                enforce_stationarity=es1,
                                enforce_invertibility=False)
results = mod.fit()

pred_uc = results.get_forecast(steps=calendar.monthrange(datetime.now().year,datetime.now().month)[1]-datetime.now().day+1)
pred_ci = pred_uc.conf_int()
ax = ts.plot(label='observed', figsize=(12, 5))
pred_uc.predicted_mean.plot(ax=ax, label='Forecast')
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.25)
ax.set_xlabel('Date')
plt.legend()
plt.show()

predict=pred_uc.predicted_mean.to_frame()
predict.reset_index(inplace=True)
predict.rename(columns={'index': 'date',0: 'revenue_forcast'}, inplace=True)
display(predict)

The output looks like this: [image: plot of the observed series and the forecast]

As you can see, the predictions go negative as a result of the negative slope.

Since I'm trying to predict income, the result cannot be lower than zero, and the negative slope also looks very strange.

What's wrong with my method? How can I improve it?

You can't force an ARIMA model to take only positive values. However, a classic 'trick' when you want to predict something that's always positive is to model a transformed series, using a function that maps the positive values onto the whole real line R. The log function is a good example of this.

panel['log_revenue'] = np.log(panel['revenue'])

Then fit the model on the log_revenue column and predict that instead.

If the predictions on the log scale take negative values, that's fine, because your actual prediction is np.exp(predict), which is always positive.

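The solution is to use a longer data history. One month of daily data covers only about four weekly cycles, which is very little for fitting a model with a 7-day seasonal period.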

