How do I pass a column name from a dataframe into a function using tidyverse syntax?
cars %>%
  group_by(cyl) %>%
  summarise_each(funs(mean(., na.rm = TRUE),
                      min(., na.rm = TRUE),
                      max(., na.rm = TRUE),
                      sd(., na.rm = TRUE)),
                 mpg, wt)
I want to turn the above code into a function, where the dataframe (cars) and the grouping column (cyl) are arguments. How can I do this in R?
I tried the following, but it does not evaluate:
plot_cars <- function (df, col) {
  df %>%
    group_by(col) %>%
    summarise_each(funs(mean(., na.rm = TRUE),
                        min(., na.rm = TRUE),
                        max(., na.rm = TRUE),
                        sd(., na.rm = TRUE)),
                   mpg, wt)
}
plot_cars(cars,"cyl")
You can try this:
plot_cars <- function (df, ...) {
  # capture the unquoted grouping columns as quosures
  dots <- enquos(...)
  df %>%
    # splice the captured columns directly into group_by()
    group_by(!!!dots) %>%
    summarise_each(funs(mean(., na.rm = TRUE),
                        min(., na.rm = TRUE),
                        max(., na.rm = TRUE),
                        sd(., na.rm = TRUE)),
                   mpg, wt)
}
plot_cars(mtcars, cyl)
# A tibble: 3 x 9
#     cyl mpg_mean wt_mean mpg_min wt_min mpg_max wt_max mpg_sd wt_sd
#   <dbl>    <dbl>   <dbl>   <dbl>  <dbl>   <dbl>  <dbl>  <dbl> <dbl>
# 1     4     26.7    2.29    21.4   1.51    33.9   3.19   4.51 0.570
# 2     6     19.7    3.12    17.8   2.62    21.4   3.46   1.45 0.356
# 3     8     15.1    4.00    10.4   3.17    19.2   5.42   2.56 0.759
This is a similar solution using lazy evaluation.
plot_cars <- function (df, ...) {
  df %>%
    # group_by_() and lazyeval belong to the older, now-deprecated dplyr interface
    group_by_(.dots = lazyeval::lazy_dots(...)) %>%
    summarise_each(funs(mean(., na.rm = TRUE),
                        min(., na.rm = TRUE),
                        max(., na.rm = TRUE),
                        sd(., na.rm = TRUE)),
                   mpg, wt)
}
plot_cars(mtcars, cyl)
If you'd like to read more about this, follow this link: https://medium.com/optima-blog/writing-your-own-dplyr-functions-a1568720db0d
foo = function(d, grp, ...) {
  d %>%
    group_by_at(grp) %>%
    summarise_at(c(...), funs(mean(., na.rm = TRUE),
                              min(., na.rm = TRUE),
                              max(., na.rm = TRUE),
                              sd(., na.rm = TRUE)))
}
foo(mtcars, "cyl", "mpg", "wt")
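For what it's worth, the same string-based interface can probably be written without the superseded group_by_at / summarise_at / funs helpers. The sketch below swaps in across() and all_of() and assumes dplyr >= 1.0.0; foo2 is a hypothetical name, not something from the answer above.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, which provides across()

# foo2 is a hypothetical name. grp and ... are column names given as strings.
foo2 <- function(d, grp, ...) {
  d %>%
    # all_of() selects columns by the character vector it is given
    group_by(across(all_of(grp))) %>%
    summarise(across(all_of(c(...)),
                     list(mean = ~ mean(.x, na.rm = TRUE),
                          min  = ~ min(.x, na.rm = TRUE),
                          max  = ~ max(.x, na.rm = TRUE),
                          sd   = ~ sd(.x, na.rm = TRUE))),
              .groups = "drop")
}

out <- foo2(mtcars, "cyl", "mpg", "wt")
```

Here the result columns are named like mpg_mean, mpg_min, and so on, because across() combines the column name with the name of each list element.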
If we need to pass a string, convert it to a symbol with sym and evaluate it. But, to be more flexible, it is better to use ensym, so that the function can take both unquoted and quoted column names.
library(dplyr) # 1.0.0
plot_cars <- function (df, col) {
  col <- ensym(col)
  df %>%
    group_by(!!col) %>%
    summarise(across(c(mpg, wt),
                     list(mean = ~ mean(., na.rm = TRUE),
                          min = ~ min(., na.rm = TRUE),
                          max = ~ max(., na.rm = TRUE),
                          sd = ~ sd(., na.rm = TRUE))))
}
plot_cars(mtcars,"cyl")
# A tibble: 3 x 9
# cyl mpg_mean mpg_min mpg_max mpg_sd wt_mean wt_min wt_max wt_sd
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 26.7 21.4 33.9 4.51 2.29 1.51 3.19 0.570
#2 6 19.7 17.8 21.4 1.45 3.12 2.62 3.46 0.356
#3 8 15.1 10.4 19.2 2.56 4.00 3.17 5.42 0.759
plot_cars(mtcars, cyl)
# A tibble: 3 x 9
# cyl mpg_mean mpg_min mpg_max mpg_sd wt_mean wt_min wt_max wt_sd
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 26.7 21.4 33.9 4.51 2.29 1.51 3.19 0.570
#2 6 19.7 17.8 21.4 1.45 3.12 2.62 3.46 0.356
#3 8 15.1 10.4 19.2 2.56 4.00 3.17 5.42 0.759
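For comparison, the string-only route with sym might look roughly like the sketch below. plot_cars_str is a hypothetical name, and the code assumes dplyr >= 1.0.0 with rlang installed (it comes along as a dplyr dependency). Only two summary functions are kept to make it short.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0 for across()

# plot_cars_str is a hypothetical name; col must be a character string here.
plot_cars_str <- function(df, col) {
  col <- rlang::sym(col)  # turn the string "cyl" into the symbol cyl
  df %>%
    group_by(!!col) %>%
    summarise(across(c(mpg, wt),
                     list(mean = ~ mean(.x, na.rm = TRUE),
                          sd   = ~ sd(.x, na.rm = TRUE))))
}

res <- plot_cars_str(mtcars, "cyl")
```

Unlike the ensym version, calling plot_cars_str(mtcars, cyl) with a bare name should fail, because cyl is then evaluated as an ordinary variable before sym ever sees it.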