Forecasting multiple time series in R using auto.arima
My dataset has the following 3 columns:

date        client_id  sales
01/01/2012  client 1   $1000
02/01/2012  client 1   $900
...
...
12/01/2014  client 1   $1000
01/01/2012  client 2   $300
02/01/2012  client 2   $450
...
..
12/01/2014  client 2   $375

and so on for the other 98 clients. I have about 100 clients in total, and each client's data is a time series of 24 monthly data points.
How can I automatically forecast sales for all 100 clients with auto.arima in R? Is there a by-statement option, or do I have to use a loop?
Thanks
You can always use lapply():
lapply(tsMat, function(x) forecast(auto.arima(x)))
A small example follows:
library(forecast)
# generate some time series (simulated ARMA(1,1) data, 24 months x 100 clients):
sales <- replicate(100,
    arima.sim(n = 24, list(ar = c(0.8), ma = c(-0.2)), sd = sqrt(0.1))
)
dates <- seq(as.Date("2012/1/1"), by = "month", length.out = 24)
df <- data.frame(date = rep(dates, 100), client_id = rep(1:100, each = 24), sales = c(sales))
# reshape to wide format (one column per client), drop the date column
# so it is not forecast as a series, and convert to a monthly ts:
wide <- reshape2::dcast(df, date ~ client_id, value.var = "sales")
tsMat <- ts(wide[, -1], start = 2012, frequency = 12)
# forecast each client's series with auto.arima:
output <- lapply(tsMat, function(x) forecast(auto.arima(x)))
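The list returned by lapply() holds one forecast object per client. Below is a minimal sketch of collecting just the point forecasts into a single matrix, assuming a small simulated data set of 3 clients. The names ts_mat, fits and point_fc, the seed, and the 12-month horizon are illustrative choices, not part of the answer above.

```r
library(forecast)

set.seed(42)    # arbitrary seed, only for reproducibility
n_clients <- 3  # kept small so the sketch runs quickly

# Simulate 24 monthly observations per client (one column per client).
sales <- replicate(n_clients,
  arima.sim(n = 24, list(ar = 0.8, ma = -0.2), sd = sqrt(0.1)))
colnames(sales) <- paste0("client_", seq_len(n_clients))
ts_mat <- ts(sales, start = c(2012, 1), frequency = 12)

# Fit one model per column, indexing the columns explicitly.
fits <- lapply(seq_len(ncol(ts_mat)), function(i) auto.arima(ts_mat[, i]))
names(fits) <- colnames(ts_mat)

# Forecast 12 months ahead and keep only the point forecasts:
# the result has one row per forecast month and one column per client.
point_fc <- sapply(fits, function(fit) as.numeric(forecast(fit, h = 12)$mean))
dim(point_fc)
```

Indexing the columns with seq_len(ncol(...)) makes the per-client iteration explicit, which can be easier to debug when one series fails to fit.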
You can also specify how many periods ahead to forecast by passing h = <number of periods> in the forecast() call:
Forecast.allStates <- as.data.frame(lapply(ts.allStates, function(x) forecast(auto.arima(x),h=67)))
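As a hedged illustration of the h argument on a single series, the sketch below forecasts 6 months ahead and converts the result to a data frame. The simulated series y, the horizon of 6, level = 95 and the name fc_table are assumptions made for the example.

```r
library(forecast)

set.seed(1)  # arbitrary seed, only for reproducibility

# One simulated monthly series with 24 observations, starting January 2012.
y <- ts(arima.sim(n = 24, list(ar = 0.8, ma = -0.2), sd = sqrt(0.1)),
        start = c(2012, 1), frequency = 12)

# h sets the number of periods to forecast; level sets the interval coverage.
fc <- forecast(auto.arima(y), h = 6, level = 95)

# One row per forecast period: point forecast plus lower and upper bounds.
fc_table <- as.data.frame(fc)
nrow(fc_table)
```

Applied inside lapply(), the same call gives every client the same horizon.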
Another option might be tsibble and fable:
library(tsibble)
library(fable)
library(dplyr)
df %>%
  as_tsibble(key = client_id, index = date) %>%
  mutate(date = yearmonth(date)) %>%
  model(arima = ARIMA(sales)) %>%
  forecast(h = "1 year")
#> # A fable: 1,200 x 5 [1M]
#> # Key:     client_id, .model [100]
#>    client_id .model     date           sales   .mean
#>        <int> <chr>     <mth>          <dist>   <dbl>
#>  1         1 arima  2014 gen N(0.072, 0.089)  0.0718
#>  2         1 arima  2014 feb   N(0.28, 0.11)   0.281
#>  3         1 arima  2014 mar   N(0.35, 0.12)   0.351
#>  4         1 arima  2014 apr  N(0.024, 0.12)  0.0242
#>  5         1 arima  2014 mag  N(-0.16, 0.12)  -0.162
#>  6         1 arima  2014 giu  N(0.029, 0.12)  0.0292
#>  7         1 arima  2014 lug   N(0.24, 0.12)   0.243
#>  8         1 arima  2014 ago   N(0.11, 0.12)   0.110
#>  9         1 arima  2014 set   N(0.37, 0.12)   0.374
#> 10         1 arima  2014 ott   N(0.37, 0.12)   0.369
#> # ... with 1,190 more rows
where df is:
set.seed(1)
sales <- replicate(100, arima.sim(n = 24, list(ar = c(0.8), ma = c(-0.2)), sd = sqrt(0.1)))
dates <- seq(as.Date("2012/1/1"), by = "month", length.out=24)
df <- data.frame(date=rep(dates,100), client_id=rep(1:100,each=24), sales=c(sales))
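Each row of the fable carries a full forecast distribution in the sales column, so prediction intervals can be derived from it. Below is a minimal sketch, assuming tsibble, fable and dplyr are installed (automatic differencing selection in ARIMA() may also need the feasts package installed). The two-client data frame df_small, the seed and the 95% level are illustrative assumptions.

```r
library(tsibble)
library(fable)
library(dplyr)

set.seed(1)     # arbitrary seed, only for reproducibility
n_clients <- 2  # kept small so the sketch runs quickly
dates <- seq(as.Date("2012-01-01"), by = "month", length.out = 24)

# Long-format data: 24 monthly rows per client.
df_small <- data.frame(
  date      = rep(dates, n_clients),
  client_id = rep(seq_len(n_clients), each = 24),
  sales     = as.numeric(replicate(n_clients,
                arima.sim(n = 24, list(ar = 0.8, ma = -0.2), sd = sqrt(0.1))))
)

# Convert the index to year-month first, then fit one ARIMA model per client.
fc <- df_small %>%
  mutate(date = yearmonth(date)) %>%
  as_tsibble(key = client_id, index = date) %>%
  model(arima = ARIMA(sales)) %>%
  forecast(h = "1 year")

# hilo() adds an interval column computed from the forecast distribution.
fc_intervals <- fc %>% hilo(level = 95)
names(fc_intervals)
```

Converting date with yearmonth() before calling as_tsibble() lets the monthly interval be detected directly when the tsibble is built.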