
Forecasting multiple time series in R using auto.arima

My dataset has the following 3 columns:

date client_id sales
01/01/2012 client 1 $1000
02/01/2012 client 1 $900
...
...
12/01/2014 client 1 $1000
01/01/2012 client 2 $300
02/01/2012 client 2 $450
...
...
12/01/2014 client 2 $375

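and so on for the other 98 clients (24 monthly data points for each client).

I have multiple clients (about 100), and the data for each client is a time series of 24 monthly data points.

How do I automatically forecast sales for all 100 clients using auto.arima in R? Is there a by-statement option, or do I have to use a loop?

Thanks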

You can always use lapply():

lapply(tsMat, function(x) forecast(auto.arima(x)))

A small example:

library(forecast)
# generate some time series:
sales <- replicate(100,
    arima.sim(n = 24, list(ar = c(0.8), ma = c(-0.2)), sd = sqrt(0.1))
)
dates <- seq(as.Date("2012/1/1"), by = "month", length.out = 24)
df <- data.frame(date = rep(dates, 100), client_id = rep(1:100, each = 24), sales = c(sales))
# reshape to wide format (one column per client), then convert to a monthly ts;
# the date column is dropped so that it is not forecast as if it were a series:
wide <- reshape2::dcast(df, date ~ client_id, value.var = "sales")
tsMat <- ts(wide[, -1], start = c(2012, 1), frequency = 12)
# forecast each client's series with auto.arima:
output <- lapply(tsMat, function(x) forecast(auto.arima(x)))
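If you would rather skip the reshape step, one possible variant works directly on the long-format data: split() the sales column by client_id, which is the closest base-R analogue of a "by" statement, and fit each piece separately. This is only a sketch. It assumes every client has the same 24 consecutive months starting in January 2012 and that the rows are already sorted by date; the helper name forecast_client and the 5-client toy data are made up for illustration.

```r
library(forecast)

# Toy long-format data (assumption: 5 clients, 24 monthly points each, sorted by date)
set.seed(1)
n_clients <- 5
dates <- seq(as.Date("2012-01-01"), by = "month", length.out = 24)
sim <- replicate(n_clients,
                 arima.sim(n = 24, list(ar = 0.8, ma = -0.2), sd = sqrt(0.1)))
df <- data.frame(
  date      = rep(dates, n_clients),
  client_id = rep(seq_len(n_clients), each = 24),
  sales     = as.numeric(sim)
)

# Hypothetical helper: fit auto.arima to one client's sales and forecast h months
forecast_client <- function(x, h = 12) {
  y <- ts(x, start = c(2012, 1), frequency = 12)
  forecast(auto.arima(y), h = h)
}

# One numeric vector per client, then one forecast object per client
by_client <- split(df$sales, df$client_id)
fc_list   <- lapply(by_client, forecast_client, h = 12)

# Collect the point forecasts into one long data frame
point_fc <- do.call(rbind, lapply(names(fc_list), function(id) {
  data.frame(client_id = id,
             step      = seq_along(fc_list[[id]]$mean),
             forecast  = as.numeric(fc_list[[id]]$mean))
}))
```

Each element of fc_list is an ordinary forecast object, so plot(fc_list[["1"]]) or summary(fc_list[["1"]]) should work as it does for a single series.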

You can also specify how many periods ahead to forecast by passing h = <number of periods> in the forecast() call:

Forecast.allStates <- as.data.frame(lapply(ts.allStates, function(x) forecast(auto.arima(x),h=67)))
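To see what h controls, it may help to look at a single series first. The sketch below uses a simulated series as a stand-in for one client's sales; h = 6 and level = 95 are arbitrary values chosen for illustration.

```r
library(forecast)

set.seed(1)
# Simulated monthly series (illustrative stand-in for one client's sales)
y <- ts(arima.sim(n = 24, list(ar = 0.8, ma = -0.2), sd = sqrt(0.1)),
        start = c(2012, 1), frequency = 12)

# h = number of future periods; level = coverage of the prediction intervals (%)
fc <- forecast(auto.arima(y), h = 6, level = 95)

fc$mean                      # point forecasts as a ts continuing after the data
fc_df <- as.data.frame(fc)   # one row per horizon: point forecast plus interval bounds
```

With a single level, fc_df has one row per forecast period and columns for the point forecast and the lower and upper interval bounds.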

Another option could be the tsibble and fable packages:

library(tsibble)
library(fable)
library(dplyr)

df %>%
   mutate(date = yearmonth(date)) %>%            # use a monthly index before building the tsibble
   as_tsibble(key = client_id, index = date) %>%
   model(arima = ARIMA(sales)) %>%               # one automatically selected ARIMA per client
   forecast(h = "1 year")
#> # A fable: 1,200 x 5 [1M]
#> # Key:     client_id, .model [100]
#>    client_id .model     date           sales   .mean
#>        <int> <chr>     <mth>          <dist>   <dbl>
#>  1         1 arima  2014 gen N(0.072, 0.089)  0.0718
#>  2         1 arima  2014 feb   N(0.28, 0.11)  0.281 
#>  3         1 arima  2014 mar   N(0.35, 0.12)  0.351 
#>  4         1 arima  2014 apr  N(0.024, 0.12)  0.0242
#>  5         1 arima  2014 mag  N(-0.16, 0.12) -0.162 
#>  6         1 arima  2014 giu  N(0.029, 0.12)  0.0292
#>  7         1 arima  2014 lug   N(0.24, 0.12)  0.243 
#>  8         1 arima  2014 ago   N(0.11, 0.12)  0.110 
#>  9         1 arima  2014 set   N(0.37, 0.12)  0.374 
#> 10         1 arima  2014 ott   N(0.37, 0.12)  0.369 
#> # ... with 1,190 more rows

where df is:

set.seed(1)
sales <- replicate(100, arima.sim(n = 24, list(ar = c(0.8), ma = c(-0.2)), sd = sqrt(0.1)))
dates <- seq(as.Date("2012/1/1"), by = "month", length.out=24)
df <- data.frame(date=rep(dates,100), client_id=rep(1:100,each=24), sales=c(sales))
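If you need prediction intervals from the fable as ordinary columns, something along these lines may work. It is a sketch under assumptions: the data are a 3-client toy version of df, and the 6-month horizon and 95% level are arbitrary. hilo() comes from fabletools, which is attached together with fable.

```r
library(tsibble)
library(fable)
library(dplyr)

# Toy data (assumption: 3 clients, 24 monthly points each)
set.seed(1)
n_clients <- 3
sim <- replicate(n_clients,
                 arima.sim(n = 24, list(ar = 0.8, ma = -0.2), sd = sqrt(0.1)))
dates <- seq(as.Date("2012-01-01"), by = "month", length.out = 24)
df_small <- data.frame(date      = rep(dates, n_clients),
                       client_id = rep(seq_len(n_clients), each = 24),
                       sales     = as.numeric(sim))

# Fit one ARIMA per client and forecast 6 months ahead
fc <- df_small %>%
  mutate(date = yearmonth(date)) %>%
  as_tsibble(key = client_id, index = date) %>%
  model(arima = ARIMA(sales)) %>%
  forecast(h = "6 months")

# Add a 95% interval column and drop the fable class for further processing
fc_tbl <- fc %>%
  hilo(level = 95) %>%
  as_tibble()
```

fc_tbl keeps one row per client and forecast month, with the point forecast in .mean and the interval in an extra column.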

