Given a tsibble with more than one key, is tidyverts able to box_cox() each time series using a respective lambda_guerrero value per time series?
My question is: if I had a tsibble with more than one key (n_keys > 1) and one or more key variables (key_vars >= 1), is the tidyverts suite able to perform a box_cox transformation on each time series (one transformation per series), using the lambda_guerrero value computed for that series? Below is my (first) attempt at a minimal reproducible example.
For example: I'm wondering if "step 5" is possible using the tidyverts suite without receiving an error. Rather than applying lambda1 = 0.36 to the concessional, general, and aggregated series, as "step 4" does without error, I'd like to apply 0.25 to concessional, 0.66 to general, and 0.36 to aggregated, if possible.
Thank you!
library(tidyverse)
library(lubridate)
library(tsibble)
library(tsibbledata)
library(fabletools)
library(fable)
library(feasts)
library(distributional)
tsibbledata::PBS %>% summarize(Cost = sum(Cost)) %>% autoplot(Cost)
Similar to an example in FPP3 Chapter 3.1. For reference: https://otexts.com/fpp3/transformations.html
lambda1 <- tsibbledata::PBS %>%
  summarize(Cost = sum(Cost)) %>%
  features(Cost, features = guerrero) %>%
  pull(lambda_guerrero) # [1] 0.3642197
tsibbledata::PBS %>% summarize(Cost = sum(Cost)) %>% autoplot(box_cox(Cost, lambda1))
tsibbledata::PBS %>% aggregate_key(Concession, Cost = sum(Cost)) %>% autoplot(Cost)
tsibbledata::PBS %>%
  aggregate_key(Concession, Cost = sum(Cost)) %>%
  autoplot(box_cox(Cost, lambda1))
lambda2 <- tsibbledata::PBS %>%
  aggregate_key(Concession, Cost = sum(Cost)) %>%
  features(Cost, features = guerrero) %>%
  pull(lambda_guerrero) # [1] 0.2518823 0.6577645 0.3642197
lambda2
# Without the pull() step, the features() result prints as a tibble:
#> # A tibble: 3 x 2
#>   Concession   lambda_guerrero
#>   <chr*>                 <dbl>
#> 1 Concessional           0.252
#> 2 General                0.658
#> 3 <aggregated>           0.364
tsibbledata::PBS %>%
  aggregate_key(Concession, Cost = sum(Cost)) %>%
  autoplot(box_cox(Cost, lambda2)) # caused an error
The issue with your last attempt is related to the length of the values passed to box_cox(Cost, lambda2). Cost has length 612 (204 observations for each of 3 series), and lambda2 has length 3. So R will try to replicate the values in lambda2 so that the lengths match (called "recycling").
However, it does this wrong in this case. It matches Cost[1] with lambda2[1] (correct), Cost[2] with lambda2[2] (incorrect), Cost[3] with lambda2[3] (incorrect), Cost[4] with lambda2[1] (correct), etc. The correct recycling of the parameters is Cost[1:204] with lambda2[1], Cost[205:408] with lambda2[2], and Cost[409:612] with lambda2[3], so we need to ensure this.
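The mismatch can be seen on a small stand-in for the real data. The sketch below is illustrative only: it uses base R, assumes the rows are stored series by series, and uses 4 observations per series instead of 204. The names n_obs, series, lambda, recycled, and blocked are made up for this example and do not come from the original code.

```r
# Toy stand-in: 3 series of 4 observations each, stored series by series
# (the real Cost column has 3 series of 204 observations).
n_obs  <- 4
series <- rep(c("Concessional", "General", "<aggregated>"), each = n_obs)
lambda <- c(0.25, 0.66, 0.36)  # one rounded lambda per series

# Element-wise recycling cycles through lambda: 1, 2, 3, 1, 2, 3, ...
recycled <- rep(lambda, length.out = length(series))

# What is needed instead: each lambda repeated across its own series
blocked <- rep(lambda, each = n_obs)

# Side-by-side comparison: only `blocked` is constant within a series
data.frame(series, recycled, blocked)
```

In the printed data frame, the recycled column changes on every row, while the blocked column holds one value per series.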
This can be done with rep(lambda2, each = 204); however, the best/safest approach is to use a join operation. This ensures that each parameter is matched to the correct series (and prevents issues with row ordering). The code below shows how this can be done with left_join(), which matches the lambda values to the data based on the Concession column.
Note that the plot doesn't look very good, as the transformations (and data) produce values on very different scales. To fix this, I recommend faceting to produce a different y-axis scale for each series (also done below).
library(fpp3)
lambda2 <- tsibbledata::PBS %>%
  aggregate_key(Concession, Cost = sum(Cost)) %>%
  features(Cost, features = guerrero)
lambda2
#> # A tibble: 3 x 2
#> Concession lambda_guerrero
#> <chr*> <dbl>
#> 1 Concessional 0.252
#> 2 General 0.658
#> 3 <aggregated> 0.364
tsibbledata::PBS %>%
  aggregate_key(Concession, Cost = sum(Cost)) %>%
  # Add lambda to the dataset, matching based on the key variable
  left_join(lambda2, by = "Concession") %>%
  autoplot(box_cox(Cost, lambda_guerrero))
tsibbledata::PBS %>%
  aggregate_key(Concession, Cost = sum(Cost)) %>%
  # Add lambda to the dataset, matching based on the key variable
  left_join(lambda2, by = "Concession") %>%
  autoplot(box_cox(Cost, lambda_guerrero)) +
  facet_grid(rows = vars(Concession), scales = "free_y")
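As a rough illustration of why the join is safer than relying on row order, the sketch below uses plain dplyr tibbles rather than the PBS tsibble. The toy and lambdas tables and their values are made up for this example. The lambda table is deliberately listed in a different order from the data, and each series still receives its own value.

```r
library(dplyr)

# Hypothetical toy data: two series, three observations each
toy <- tibble(
  Concession = rep(c("Concessional", "General"), each = 3),
  Cost       = c(10, 12, 15, 100, 130, 90)
)

# Lambda table listed in the opposite order to the data (values rounded)
lambdas <- tibble(
  Concession      = c("General", "Concessional"),
  lambda_guerrero = c(0.66, 0.25)
)

# The join matches on the key column, so row order does not matter
joined <- left_join(toy, lambdas, by = "Concession")

# Sanity check: each series should carry exactly one distinct lambda
check <- joined %>%
  group_by(Concession) %>%
  summarise(
    n_lambda = n_distinct(lambda_guerrero),
    lambda   = first(lambda_guerrero)
  )
check
```

A check like this, run on the joined data before plotting, would confirm that every key has exactly one lambda attached.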
Created on 2021-01-09 by the reprex package (v0.3.0)