R dplyr mutate multiple columns using custom function to create new column
Add new columns with custom function using mutate
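I want to do a simple operation and add a new column with dplyr's mutate. I have a data frame with many columns and only want to use some of them: the ones named hist_avg_X, tgt_X and monthly_X_ly. Adding a new column named "fct_" + metric should be straightforward. However, as you can see below, the columns are added but with odd names (fct_visits$hist_avg_visits and fct_revenue$hist_avg_revenue_lcy).
I also tried to do this with mutate + across, since it would save me many lines of code, but I could not figure out how.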
library(tidyverse)
(example <- tibble(brand = c("Brand A", "Brand A", "Brand A", "Brand A", "Brand A"),
                   country = c("Country A", "Country A", "Country A", "Country A", "Country A"),
                   date = c("2020-08-01", "2020-08-02", "2020-08-03", "2020-08-04", "2020-08-05"),
                   visits = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
                   visits_ly = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
                   tgt_visits = c(2491306, 2491306, 2491306, 2491306, 2491306),
                   hist_avg_visits = c(177185, 175758, 225311, 210871, 197405),
                   monthly_visits_ly = c(3765612, 3765612, 3765612, 3765612, 3765612),
                   revenue_lcy = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
                   revenue_ly = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
                   tgt_revenue_lcy = c(48872737, 48872737, 48872737, 48872737, 48872737),
                   hist_avg_revenue_lcy = c(231101, 222236, 276497, 259775, 251167),
                   monthly_revenue_lcy_ly = c(17838660, 17838660, 17838660, 17838660, 17838660))) %>%
  print(width = Inf)
#> # A tibble: 5 x 13
#> brand country date visits visits_ly tgt_visits hist_avg_visits
#> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Brand A Country A 2020-08-01 NA NA 2491306 177185
#> 2 Brand A Country A 2020-08-02 NA NA 2491306 175758
#> 3 Brand A Country A 2020-08-03 NA NA 2491306 225311
#> 4 Brand A Country A 2020-08-04 NA NA 2491306 210871
#> 5 Brand A Country A 2020-08-05 NA NA 2491306 197405
#> monthly_visits_ly revenue_lcy revenue_ly tgt_revenue_lcy hist_avg_revenue_lcy
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 3765612 NA NA 48872737 231101
#> 2 3765612 NA NA 48872737 222236
#> 3 3765612 NA NA 48872737 276497
#> 4 3765612 NA NA 48872737 259775
#> 5 3765612 NA NA 48872737 251167
#> monthly_revenue_lcy_ly
#> <dbl>
#> 1 17838660
#> 2 17838660
#> 3 17838660
#> 4 17838660
#> 5 17838660
first_forecast <- function(dataset, metric) {
  avg_metric <- select(dataset, paste0("hist_avg_", metric))
  tgt_metric <- select(dataset, paste0("tgt_", metric))
  monthly_metric <- select(dataset, paste0("monthly_", metric, "_ly"))
  output <- avg_metric * (tgt_metric / monthly_metric)
  return(output)
}
example %>%
  mutate(fct_visits = first_forecast(., "visits"),
         fct_revenue = first_forecast(., "revenue_lcy")) %>%
  print(width = Inf)
#> # A tibble: 5 x 15
#> brand country date visits visits_ly tgt_visits hist_avg_visits
#> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Brand A Country A 2020-08-01 NA NA 2491306 177185
#> 2 Brand A Country A 2020-08-02 NA NA 2491306 175758
#> 3 Brand A Country A 2020-08-03 NA NA 2491306 225311
#> 4 Brand A Country A 2020-08-04 NA NA 2491306 210871
#> 5 Brand A Country A 2020-08-05 NA NA 2491306 197405
#> monthly_visits_ly revenue_lcy revenue_ly tgt_revenue_lcy hist_avg_revenue_lcy
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 3765612 NA NA 48872737 231101
#> 2 3765612 NA NA 48872737 222236
#> 3 3765612 NA NA 48872737 276497
#> 4 3765612 NA NA 48872737 259775
#> 5 3765612 NA NA 48872737 251167
#> monthly_revenue_lcy_ly fct_visits$hist_avg_visits
#> <dbl> <dbl>
#> 1 17838660 117225.
#> 2 17838660 116280.
#> 3 17838660 149064.
#> 4 17838660 139511.
#> 5 17838660 130602.
#> fct_revenue$hist_avg_revenue_lcy
#> <dbl>
#> 1 633149.
#> 2 608862.
#> 3 757521.
#> 4 711708.
#> 5 688124.
Created on 2020-07-28 by the reprex package (v0.3.0)
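select() returns a one-column data frame rather than a vector, so first_forecast() returns a data frame and mutate() stores it as a data-frame column, which is where the $ in the names comes from. Following @Onyambu's great suggestion, the last part of your code should look like this: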
example %>%
  cbind(fct_visits = first_forecast(., "visits"),
        fct_revenue = first_forecast(., "revenue_lcy")) %>%
  print(width = Inf)
brand country date visits visits_ly tgt_visits hist_avg_visits monthly_visits_ly revenue_lcy
1 Brand A Country A 2020-08-01 NA NA 2491306 177185 3765612 NA
2 Brand A Country A 2020-08-02 NA NA 2491306 175758 3765612 NA
3 Brand A Country A 2020-08-03 NA NA 2491306 225311 3765612 NA
4 Brand A Country A 2020-08-04 NA NA 2491306 210871 3765612 NA
5 Brand A Country A 2020-08-05 NA NA 2491306 197405 3765612 NA
revenue_ly tgt_revenue_lcy hist_avg_revenue_lcy monthly_revenue_lcy_ly hist_avg_visits hist_avg_revenue_lcy
1 NA 48872737 231101 17838660 117224.5 633149.5
2 NA 48872737 222236 17838660 116280.4 608862.0
3 NA 48872737 276497 17838660 149064.4 757521.3
4 NA 48872737 259775 17838660 139511.0 711707.9
5 NA 48872737 251167 17838660 130601.9 688124.5
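Note that in the cbind() output above, the two new columns are still named hist_avg_visits and hist_avg_revenue_lcy rather than fct_visits and fct_revenue. If the fct_ names are wanted, one possible alternative is to have the helper return a plain numeric vector, so that mutate() can name the column itself. The sketch below is only an illustration under a few assumptions: first_forecast_vec is a hypothetical name not used in the original post, the tibble is a two-row cut of the question's data, and the loop at the end is a plain for loop over metric names, not an across() solution.

```r
library(dplyr)

# Two rows copied from the question's example, keeping only the columns
# the forecast formula needs.
example <- tibble(
  tgt_visits             = c(2491306, 2491306),
  hist_avg_visits        = c(177185, 175758),
  monthly_visits_ly      = c(3765612, 3765612),
  tgt_revenue_lcy        = c(48872737, 48872737),
  hist_avg_revenue_lcy   = c(231101, 222236),
  monthly_revenue_lcy_ly = c(17838660, 17838660)
)

# Hypothetical variant of first_forecast(): same formula, but `[[` extracts
# numeric vectors instead of the one-column tibbles that select() returns.
first_forecast_vec <- function(dataset, metric) {
  avg_metric     <- dataset[[paste0("hist_avg_", metric)]]
  tgt_metric     <- dataset[[paste0("tgt_", metric)]]
  monthly_metric <- dataset[[paste0("monthly_", metric, "_ly")]]
  avg_metric * (tgt_metric / monthly_metric)
}

# Because the helper now returns a vector, mutate() uses the names on the
# left-hand side (fct_visits, fct_revenue) as ordinary column names.
with_mutate <- example %>%
  mutate(fct_visits  = first_forecast_vec(., "visits"),
         fct_revenue = first_forecast_vec(., "revenue_lcy"))

# To avoid repeating one line per metric, loop over the metric names.
metrics <- c("visits", "revenue_lcy")
with_loop <- example
for (m in metrics) {
  with_loop[[paste0("fct_", m)]] <- first_forecast_vec(example, m)
}

print(names(with_mutate))
```

Both with_mutate and with_loop should end up with regular numeric columns fct_visits and fct_revenue; adding another metric to the loop only requires extending the metrics vector, provided the matching hist_avg_, tgt_ and monthly_ columns exist.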