
Hierarchical forecast with xreg late in series using fable

I am using the great fable package and am trying to create a hierarchical forecast using ARIMA and ETS models, reconciling with top-down (td), middle-out (mo), bottom-up (bu), and MinT (min_trace) to compare them and see which approach works best. My series has some effects late in the series that need to be regressed away, so I am trying to create a binary regressor to deal with that. I have read link1 and link2 about using the new_data argument to add a regressor to a hierarchical forecast, instead of the xreg argument that I have used for non-hierarchical forecasts. I've had success with this approach by splitting the data into train and test sets and passing the test set to new_data, as Rob Hyndman describes in link1. The problem with the current task is that the effects that need to be modelled away are all late in the series, so they all fall in the test set.

First, here is my reproducible example data:

library(tidyverse)
library(forecast)
library(fable)
library(tsibble)
library(tsibbledata)
library(lubridate)

data <- aus_livestock %>%
  filter(State %in% c("Tasmania", "New South Wales", "Queensland"),
         as.Date(Month) > as.Date("2000-01-01")) %>%
  aggregate_key(State, Count=sum(Count)) %>%
  mutate(xreg=as.factor(if_else(as.Date(Month) > as.Date("2018-01-01") &
                                  as.Date(Month) < as.Date("2018-10-01"), 1, 0)))

I have had success in the past doing something like this:

train <- data %>%
  filter(as.Date(Month) < as.Date("2017-10-01"))

test <- data %>%
  filter(as.Date(Month) >= as.Date("2017-10-01"))

mod_data <- train %>%
  model(ets=ETS(Count),
        arima=ARIMA(Count ~ xreg)
        ) %>%
  reconcile(bu_ets=bottom_up(ets),
            td_ets=top_down(ets),
            mint_ets=min_trace(ets),
            bu_arima=bottom_up(arima),
            td_arima=top_down(arima),
            mint_arima=min_trace(arima)
            )

forc_data <- mod_data %>%
  forecast(new_data=test)

autoplot(forc_data,
         data,
         level=NULL)

But since in this case the regressor is all zeros in the train set, this produces, as expected, the message "Provided exogenous regressors are rank deficient, removing regressors: xreg1". I think what I need to do is feed all the data I have to the model rather than splitting it into train and test sets, but I am unsure how to forecast from that model with fable when there is no data to pass as new_data. The closest I've gotten is something like this:

dates <- sort(rep(seq(as.Date("2019-01-01"), as.Date("2020-12-01"), "months"), 3))

future_data <- tibble(
  Month=dates,
  State=rep(c("Tasmania", "New South Wales", "Queensland"), 24), 
  Count=0
) %>%
  mutate(Month=yearmonth(Month)) %>%
  as_tsibble(index=Month, key=State) %>%
  aggregate_key(State, Count=sum(Count)) %>%
  mutate(xreg=factor(0, levels=c(0, 1))) %>%
  select(-Count)

mod_data <- data %>%
  model(ets=ETS(Count),
        arima=ARIMA(Count ~ xreg)
        ) %>%
  reconcile(bu_ets=bottom_up(ets),
            td_ets=top_down(ets),
            mint_ets=min_trace(ets),
            bu_arima=bottom_up(arima),
            td_arima=top_down(arima),
            mint_arima=min_trace(arima)
            )

forc_data <- mod_data %>%
  forecast(new_data=future_data)

autoplot(forc_data,
         data,
         level=NULL)

Oddly, this code crashes RStudio when I run the forecast step, with the message "R session aborted. R has encountered a fatal error." I think this may be unrelated to the code, because I actually got this to work on my real data, but the forecasts don't look like I would expect.

So, in summary, I would like to know how I can use fable to forecast a hierarchical series with an exogenous regressor when all of the regression effects occur in the test-set period.

Thanks in advance for any help I can get!

I think it's not possible to do this when the effect appears only in the test set, because then the model has nothing to learn from in the train set. I.e., you can only include an exogenous variable in the training process if it is present in both the train and test sets.
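As a rough illustration of that point, the sketch below first checks that the dummy is constant in the training window, then fits on the full series and builds the future rows with tsibble's new_data() instead of assembling them by hand. Several things here are assumptions rather than anything from the question: the dummy is stored as a numeric 0/1 instead of a factor, the effect is assumed to be switched off over the whole forecast horizon, and only the bottom-up ARIMA reconciliation is kept to keep the example short. It has not been checked against the crash described in the question.

```r
library(dplyr)
library(tsibble)
library(tsibbledata)
library(fable)

# Same example data as in the question, but with a numeric 0/1 dummy
# (an assumption made here to sidestep factor-level handling).
data <- aus_livestock %>%
  filter(State %in% c("Tasmania", "New South Wales", "Queensland"),
         as.Date(Month) > as.Date("2000-01-01")) %>%
  aggregate_key(State, Count = sum(Count)) %>%
  mutate(xreg = if_else(as.Date(Month) > as.Date("2018-01-01") &
                          as.Date(Month) < as.Date("2018-10-01"), 1, 0))

# The dummy never changes before 2017-10, which is why the train/test
# split leaves ARIMA with a rank-deficient regressor.
train <- data %>%
  filter(as.Date(Month) < as.Date("2017-10-01"))
n_levels_train <- n_distinct(train$xreg)  # distinct values in the train window
n_levels_full  <- n_distinct(data$xreg)   # distinct values in the full series

# Future rows for every series in the hierarchy, 24 months ahead,
# with the effect assumed to be absent in the future.
future_data <- new_data(data, n = 24) %>%
  mutate(xreg = 0)

# Fit on all available data, reconcile, and forecast over the future rows.
fc <- data %>%
  model(arima = ARIMA(Count ~ xreg)) %>%
  reconcile(bu_arima = bottom_up(arima)) %>%
  forecast(new_data = future_data)
```

Because new_data() derives the future rows from the fitted tsibble itself, the key structure (including the aggregated level) should match the training data, which a hand-built tibble does not guarantee. If the effect is expected to recur in the future, the corresponding months in future_data would need xreg set to 1 instead.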
