
Python - Facebook Prophet - Model Underfitting

I am running a Prophet model to predict inbound call volumes. I've spent a lot of time cleaning the data, applying log scaling, and tuning hyperparameters, which yielded an "okay" MAPE (mean absolute percentage error).

My problem at this point is that the model is consistently underfitting, especially in the first 12 days of the month, and even more so in the first 6. Call volumes are always substantially higher on these days for operational reasons. They also start to build near the end of the month as volume ramps into the start of the following month.
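One way to put numbers on this pattern is to group the forecast error by day-of-month segment. The sketch below is an illustration, not part of the original workflow: the function name `error_by_month_segment`, the segment boundaries, and the input columns (`ds`, `y`, `yhat`, all on the original, untransformed scale) are assumptions, and it reads "underfitting" as forecasts falling below the actuals.

```python
import pandas as pd


def error_by_month_segment(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise forecast error by day-of-month segment.

    Expects columns 'ds' (timestamp), 'y' (actual) and 'yhat' (forecast).
    Rows with y == 0 should be dropped first, since the percentage
    error is undefined for them.
    """
    day = pd.to_datetime(df["ds"]).dt.day
    # Hypothetical segments based on the periods described above.
    segment = pd.cut(
        day,
        bins=[0, 6, 12, 24, 31],
        labels=["days 1-6", "days 7-12", "days 13-24", "days 25-31"],
    )
    pct_err = (df["yhat"] - df["y"]) / df["y"]
    out = pd.DataFrame(
        {
            "segment": segment,
            "mean_pct_error": pct_err,  # negative mean => forecast too low
            "mape": pct_err.abs(),      # mean absolute percentage error
        }
    )
    return out.groupby("segment", observed=False).mean()
```

A strongly negative `mean_pct_error` in the first two segments, next to a value near zero mid-month, would confirm that the shortfall is concentrated where the description says it is.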

Actuals are the blue dots; the forecast is the grey line. This is just one month, but it's representative of the monthly seasonality in all other months:

[screenshot: actuals vs. forecast for one month]

For the sake of simplicity, I'm only including the model details and leaving the data-cleansing steps out. I can provide more information if it would help, but the feedback I've gotten so far is that the additional detail just muddies the waters. The only thing that really matters: the results come after running a Box-Cox transformation on the data that built the model and an inverse Box-Cox transformation on the model's output. Here is the model:

# Create Model
from prophet import Prophet  # the package was named "fbprophet" before v1.0

M = Prophet(
    changepoint_prior_scale = 15,
    changepoint_range = .8,
    growth='linear',
    seasonality_mode= 'multiplicative',
    daily_seasonality=False,
    weekly_seasonality=False,
    yearly_seasonality=False,
    holidays=Holidays
    ).add_seasonality(
        name='monthly',
        period=30.5,
        fourier_order = 20,
        prior_scale = 45
    ).add_seasonality(
        name='daily',
        period=1,
        fourier_order=75,
        prior_scale=20
    ).add_seasonality(
        name='weekly',
        period=7,
        fourier_order=75,
        prior_scale=30
    ).add_seasonality(
        name='yearly',
        period = 365.25,
        fourier_order = 30, 
        prior_scale = 15)
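The Box-Cox round trip mentioned before the model definition is not shown in the post. A minimal sketch of one way to do it, assuming SciPy is used and the call volumes are strictly positive (the helper names `to_boxcox` and `from_boxcox` are hypothetical):

```python
import numpy as np
from scipy.special import inv_boxcox
from scipy.stats import boxcox


def to_boxcox(y):
    """Return the Box-Cox transformed values and the fitted lambda."""
    y = np.asarray(y, dtype=float)
    # boxcox() requires strictly positive input; zero-volume hours would
    # need a shift first (an assumption, not covered in the post).
    transformed, lam = boxcox(y)
    return transformed, lam


def from_boxcox(values, lam):
    """Map transformed values back to the original scale."""
    return inv_boxcox(np.asarray(values, dtype=float), lam)
```

The same `lam` fitted on the training data has to be reused when inverting the forecast columns (`yhat`, `yhat_lower`, `yhat_upper`), otherwise the forecast ends up on a different scale than the actuals.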

In general, I would like to improve the underfitting across the board, but especially at the beginning and end of the month. I've tried increasing changepoint_range to loosen the model up, but the results weren't noticeable. I've also tried increasing the prior_scale of the "monthly" seasonality, but nothing yielded results better than the screenshot above.

I'm at a bit of a loss. Is there a modeling technique that I could use with the Facebook Prophet model to address this? Is there a way to add a regressor that assigns specific seasonality to the first 12 days and the last 7? I did some research, but I'm not sure whether that is possible or how it would work.

Any help would be hugely appreciated.
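For reference, the literal version of that idea (a plain extra regressor rather than a separate seasonality) could look roughly like the sketch below. The column names `first_12_days` and `last_7_days` and both helper functions are hypothetical; `add_regressor` is Prophet's own method, and the flag columns would have to be present in both the training frame and the future frame.

```python
import pandas as pd


def add_month_window_flags(df: pd.DataFrame) -> pd.DataFrame:
    """Add 0/1 columns for the first 12 and last 7 days of each month."""
    out = df.copy()
    ds = pd.to_datetime(out["ds"])
    out["first_12_days"] = (ds.dt.day <= 12).astype(int)
    # days_in_month handles 28-, 29-, 30- and 31-day months.
    out["last_7_days"] = (ds.dt.day > ds.dt.days_in_month - 7).astype(int)
    return out


def build_model_with_flags(holidays=None):
    """Hypothetical helper: a Prophet model using the two flags as regressors."""
    from prophet import Prophet  # named "fbprophet" before v1.0

    m = Prophet(seasonality_mode="multiplicative", holidays=holidays)
    m.add_regressor("first_12_days", mode="multiplicative")
    m.add_regressor("last_7_days", mode="multiplicative")
    return m
```

A binary regressor like this only scales the level inside each window; it does not let the intra-day shape differ between windows, which is something a seasonality tied to a condition column can do.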

Just as an update, I've tried jacking up changepoint_range and changepoint_prior_scale; neither had any impact. Going to try reducing the amount of training data (currently using 4 years).

I think I found a workable solution, so I'm documenting it as an answer in case anyone else runs into a similar problem down the road.

Since I knew this behavior was cyclical and I know why it exists (two different monthly billing cycles at the beginning of the month, plus a recurring increase in volume at the end of the month, all of which were being underfit), I used the Prophet documentation to create additional conditional seasonalities (seasonal regressors) for those specific periods.

I started by defining functions for the seasons (the example in the Prophet documentation is for the NFL on-season and off-season):

import pandas as pd


def is_1st_billing_season(ds):
    # First billing cycle: days 1-6 of the month
    date = pd.to_datetime(ds)
    return 1 <= date.day <= 6


def is_2nd_billing_season(ds):
    # Second billing cycle: days 7-12 of the month
    date = pd.to_datetime(ds)
    return 7 <= date.day <= 12


def EOM(ds):
    # End-of-month ramp: day 25 through the last day of the month
    date = pd.to_datetime(ds)
    return 25 <= date.day <= 31

Then I applied the functions to my dataframe:

#Create Additional Seasonal Categories
Box_Cox_Data['1st_season'] = Call_Data['ds'].apply(is_1st_billing_season)
Box_Cox_Data['2nd_season'] = Call_Data['ds'].apply(is_2nd_billing_season)
Box_Cox_Data['EOM'] = Call_Data['ds'].apply(EOM)
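The assignments above read `ds` from `Call_Data` and write to `Box_Cox_Data`, which only lines up if both frames share the same index. A vectorized alternative that works on a single frame is sketched below; the helper name `add_season_flags` and the `SEASON_WINDOWS` mapping are hypothetical, while the column names and day ranges are taken from the functions above.

```python
import pandas as pd

# Column name -> inclusive (first day, last day) range of the month.
SEASON_WINDOWS = {
    "1st_season": (1, 6),
    "2nd_season": (7, 12),
    "EOM": (25, 31),
}


def add_season_flags(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with one boolean column per season window."""
    out = df.copy()
    day = pd.to_datetime(out["ds"]).dt.day
    for column, (first, last) in SEASON_WINDOWS.items():
        # Series.between is inclusive on both ends by default.
        out[column] = day.between(first, last)
    return out
```

The same helper could then be applied to the training frame and to the future dataframe, so both are guaranteed to get identical flag definitions.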

Then I updated my model to include the additional conditional seasonalities:

# Create Model
M = Prophet(
    changepoint_prior_scale = 15,
    changepoint_range = .8,
    growth='linear',
    seasonality_mode= 'multiplicative',
    daily_seasonality=False,
    weekly_seasonality=False,
    yearly_seasonality=False,
    holidays=Holidays
    ).add_seasonality(
        name='monthly',
        period=30.5,
        fourier_order = 20,
        prior_scale = 45
    ).add_seasonality(
        name='daily_1st_season',
        period=1,
        fourier_order=75,
        prior_scale=20,
        condition_name='1st_season'
    ).add_seasonality(
        name='daily_2nd_season',
        period=1,
        fourier_order=75,
        prior_scale=20,
        condition_name='2nd_season'
    ).add_seasonality(
        name='daily_EOM_season',
        period=1,
        fourier_order=75,
        prior_scale=20,
        condition_name='EOM'
    ).add_seasonality(
        name='weekly',
        period=7,
        fourier_order=75,
        prior_scale=30
    ).add_seasonality(
        name='yearly',
        period = 365.25,
        fourier_order = 30, #CHECK THIS
        prior_scale = 15)
        
#Fit Model
M.fit(Box_Cox_Data)

# Create Future Dataframe (in Hours)
future = M.make_future_dataframe(freq='H', periods = Hours_Needed)
future['1st_season'] = future['ds'].apply(is_1st_billing_season)
future['2nd_season'] = future['ds'].apply(is_2nd_billing_season)
future['EOM'] = future['ds'].apply(EOM)

# Predict Future Values
forecast = M.predict(future)

The end result looks much better:

[screenshot: updated forecast]

For full transparency, this screenshot covers a slightly different period than the original screenshot. For this project my starting point isn't critical (predictions for future periods are the primary focus), and I accidentally ran the cross-validation for a different timeframe. Still, the end result is a better-fitting seasonal forecast across all the time frames I have looked at so far.
