簡體   English   中英

R中的時間序列預測; 在初始預測后繪制“事件”並生成具有指定日期范圍的新預測圖

[英]Time series forecasting in R; plotting "events" and generating new forecasting plots with specified date range after initial forecast

我創建了一個函數,它允許我使用fable包進行時間序列預測。 該函數的想法是在特定日期/事件之后分析觀察值與預測值。 這是一個模擬數據框,它生成一列日期:-

set.seed(1)
df <- data.frame(Date = sort(sample(seq(as.Date('2018/01/01'), as.Date('2020/09/17'), by="day"),1368883, replace = T)))

這是我創建的函數。 您指定數據,然后是事件日期,然后是以天為單位的預測期,最后是圖表的標題。

event_analysis<-function(data,eventdate,period,title){
  require(dplyr)
  require(tsibble)
  require(fable)
  require(fabletools)
  require(imputeTS)
  require(ggplot2)
  data_count<-data%>%
    group_by(Date)%>%
    summarise(Count=n())
  
  data_count<-as_tsibble(data_count)
  data_count<-na_mean(data_count)
  
  
  train <- data_count %>%
    #sample_frac(0.8)
    filter(Date<=as.Date(eventdate))
  
  fit <- train %>%
    model(
      ets = ETS(Count),
      arima = ARIMA(Count),
      snaive = SNAIVE(Count)
    ) %>%
    mutate(mixed = (ets + arima + snaive) / 3)
  
  
  fc <- fit %>% forecast(h = period)
  
  forecastplot<-fc %>%
    autoplot(data_count, level = NULL)+ggtitle(title)+
    geom_vline(xintercept = as.Date(eventdate),linetype="dashed",color="red")+
    labs(caption = "Red dashed line = Event occurrence")
                                                                 
  
  fc_accuracy<-accuracy(fc,data_count)
  
  #obs<-data_count
  #colnames(obs)[2]<-"Observed"
  #obs_pred<-merge(data_count,fc_accuracy, by="Date")
  return(list(forecastplot,fc_accuracy,fc))
}

在一次運行中,我指定了df 、事件日期、我想要預測的天數(3 周),然后是標題:-

event_analysis(df, "2020-01-01",21,"Event forecast")

這將打印這個結果和情節: -

在此處輸入圖片說明

在此處輸入圖片說明

我承認我制作的模擬數據並不完全理想,但該功能在我的真實世界數據上運行良好。

這是我想要實現的目標。 我想要這個函數的輸出,但此外,我想要一個額外的圖表,它可以“放大”預測的時間段,原因有兩個:-

  1. 為了便於解釋
  2. 我希望能夠看到事件日期之前的N 天數和事件日期之后的 N 天數(N 代表預測期,即 21)。

所以,一個額外的圖表(連同原始的完整預測)看起來像這樣,也許在一個輸出中,“多圖”風格:-

在此處輸入圖片說明

另一件事是打印另一個輸出,該輸出顯示測試集中的觀察值與預測中使用的模型的預測值。

這些基本上是我想添加到我的函數中的另外兩件事,但我不知道如何去做。 任何幫助都非常感謝:)。

我想你可以這樣重寫它。 我做了一些調整來幫助你。

set.seed(1)
df <- data.frame(Date = sort(sample(seq(as.Date('2018/01/01'), as.Date('2020/09/17'), by="day"),1368883, replace = T)))

event_analysis <- function(data, eventdate, period, title){
 
 # in the future, you may think to move them out
 library(dplyr)
 library(tsibble)
 library(fable)
 library(fabletools)
 library(imputeTS)
 library(ggplot2)
 
 # convert at the beginning
 eventdate <- as.Date(eventdate)
 
 # more compact sintax
 data_count <- count(data, Date, name = "Count")
 
 # better specify the date variable to avoid the message
 data_count <- as_tsibble(data_count, index = Date)
 
 # you need to complete missing dates, just in case
 data_count <- tsibble::fill_gaps(data_count)
 
 
 data_count <- na_mean(data_count)
 
 
 train <- data_count %>%
  filter(Date <= eventdate)
 
 test <- data_count %>%
  filter(Date > eventdate, Date <= (eventdate+period))
 
  fit <- train %>%
  model(
   ets    = ETS(Count),
   arima  = ARIMA(Count),
   snaive = SNAIVE(Count)
  ) %>%
  mutate(mixed = (ets + arima + snaive) / 3)
 
 
 fc <- fit %>% forecast(h = period)
 

 # your plot
 forecastplot <- fc %>%
  autoplot(data_count, level = NULL) + 
  ggtitle(title) +
  geom_vline(xintercept = as.Date(eventdate), linetype = "dashed", color = "red") +
  labs(caption = "Red dashed line = Event occurrence")
 
 
 # plot just forecast and test
 zoomfcstplot <- autoplot(fc) + autolayer(test, .vars = Count)
 
 fc_accuracy <- accuracy(fc,data_count)
 

 ### EDIT: ###

 # results vs test
 res <- fc %>% 
  as_tibble() %>% 
  select(-Count) %>% 
  tidyr::pivot_wider(names_from = .model, values_from = .mean) %>% 
  inner_join(test, by = "Date")

 ##############
 

 return(list(forecastplot = forecastplot,
             zoomplot     = zoomfcstplot,
             accuracy     = fc_accuracy,
             forecast     = fc,
             results      = res))
}


event_analysis(df, 
               eventdate = "2020-01-01",
               period    = 21,
               title     = "Event forecast")


輸出:

#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> Carico il pacchetto richiesto: fabletools
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo
#> $forecastplot

在此處輸入圖片說明

#> 
#> $zoomplot

在此處輸入圖片說明

#> 
#> $accuracy
#> # A tibble: 4 x 9
#>   .model .type    ME  RMSE   MAE   MPE  MAPE  MASE    ACF1
#>   <chr>  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
#> 1 arima  Test  -16.8  41.8  35.2 -1.31  2.61 0.791  0.164 
#> 2 ets    Test  -16.8  41.8  35.2 -1.31  2.61 0.791  0.164 
#> 3 mixed  Test  -21.9  44.7  36.8 -1.68  2.73 0.825 -0.0682
#> 4 snaive Test  -32.1  57.3  46.6 -2.43  3.45 1.05  -0.377 
#> 
#> $forecast
#> # A fable: 84 x 4 [1D]
#> # Key:     .model [4]
#>    .model Date               Count .mean
#>    <chr>  <date>            <dist> <dbl>
#>  1 ets    2020-01-02 N(1383, 1505) 1383.
#>  2 ets    2020-01-03 N(1383, 1505) 1383.
#>  3 ets    2020-01-04 N(1383, 1505) 1383.
#>  4 ets    2020-01-05 N(1383, 1505) 1383.
#>  5 ets    2020-01-06 N(1383, 1505) 1383.
#>  6 ets    2020-01-07 N(1383, 1505) 1383.
#>  7 ets    2020-01-08 N(1383, 1505) 1383.
#>  8 ets    2020-01-09 N(1383, 1505) 1383.
#>  9 ets    2020-01-10 N(1383, 1505) 1383.
#> 10 ets    2020-01-11 N(1383, 1505) 1383.
#> # ... with 74 more rows
#>
#> $results
#> # A tibble: 21 x 6
#>    Date         ets arima snaive mixed Count
#>    <date>     <dbl> <dbl>  <dbl> <dbl> <int>
#>  1 2020-01-02 1383. 1383.   1386 1384.  1350
#>  2 2020-01-03 1383. 1383.   1366 1377.  1398
#>  3 2020-01-04 1383. 1383.   1426 1397.  1357
#>  4 2020-01-05 1383. 1383.   1398 1388.  1415
#>  5 2020-01-06 1383. 1383.   1431 1399.  1399
#>  6 2020-01-07 1383. 1383.   1431 1399.  1346
#>  7 2020-01-08 1383. 1383.   1350 1372.  1299
#>  8 2020-01-09 1383. 1383.   1386 1384.  1303
#>  9 2020-01-10 1383. 1383.   1366 1377.  1365
#> 10 2020-01-11 1383. 1383.   1426 1397.  1328
#> # ... with 11 more rows 

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM