I have a data.frame with different variables that need to be summarised with different measures.
I'm looking for an easily readable equivalent of:
baseline_table <- function(data, var) {
  data %>%
    group_by(Species) %>%
    summarise(
      !!sym(paste(var, "_mean", sep = "")) := !!sym(var) %>% mean(na.rm = TRUE),
      !!sym(paste(var, "_sd", sep = "")) := !!sym(var) %>% sd(na.rm = TRUE)
    )
}

iris %>%
  baseline_table(var = "Sepal.Length")
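You can use glue syntax together with the embrace operator {{ }} to make it more readable. Note that the column is then passed as a bare name rather than as a string.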
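library(dplyr)

baseline_table <- function(data, var) {
  data %>%
    group_by(Species) %>%
    summarise(
      # "{{var}}" inside a string on the left of := becomes the column's name
      "{{var}}_mean" := mean({{ var }}, na.rm = TRUE),
      "{{var}}_sd" := sd({{ var }}, na.rm = TRUE)
    )
}

# var is an unquoted column name here
iris %>%
  baseline_table(var = Sepal.Length)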
You can use across():
library(dplyr)

baseline_table <- function(data, var) {
  data %>%
    group_by(Species) %>%
    summarise(across(all_of(var), list(mean = mean, sd = sd)))
}

iris %>% baseline_table(var = "Sepal.Length")
#   Species    Sepal.Length_mean Sepal.Length_sd
#   <fct>                  <dbl>           <dbl>
# 1 setosa                  5.01           0.352
# 2 versicolor              5.94           0.516
# 3 virginica               6.59           0.636
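The function in the question passed na.rm = TRUE to both mean() and sd(), which the across() call here does not do. A minimal sketch of one way to keep that behaviour is to use lambda-style formulas inside the list. The name baseline_table_narm is only a hypothetical label for this variant, and the sketch assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical variant of baseline_table(): same grouping and naming,
# but each summary function drops missing values before computing.
baseline_table_narm <- function(data, var) {
  data %>%
    group_by(Species) %>%
    summarise(across(
      all_of(var),
      list(
        mean = ~ mean(.x, na.rm = TRUE),
        sd   = ~ sd(.x, na.rm = TRUE)
      )
    ))
}

iris %>% baseline_table_narm(var = "Sepal.Length")
```

Since iris has no missing values, this should give the same numbers as the version without na.rm; the difference only matters for data that contains NA.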
The benefit of using across() is that you can apply this to more than one column.
iris %>% baseline_table(var = c("Sepal.Length", "Sepal.Width"))
#   Species    Sepal.Length_mean Sepal.Length_sd Sepal.Width_mean Sepal.Width_sd
#   <fct>                  <dbl>           <dbl>            <dbl>          <dbl>
# 1 setosa                  5.01           0.352             3.43          0.379
# 2 versicolor              5.94           0.516             2.77          0.314
# 3 virginica               6.59           0.636             2.97          0.322
If you want more customised column names, take a look at the .names parameter in ?across.
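As a rough illustration of .names, the sketch below puts the function name before the column name. The glue specification "{.fn}_of_{.col}" and the function name baseline_table_named are arbitrary choices made for this example, not something taken from the answer above.

```r
library(dplyr)

# Hypothetical variant of baseline_table() with a custom naming pattern.
# In .names, {.col} is the column name and {.fn} is the name given
# to the function in the list.
baseline_table_named <- function(data, var) {
  data %>%
    group_by(Species) %>%
    summarise(across(
      all_of(var),
      list(mean = mean, sd = sd),
      .names = "{.fn}_of_{.col}"
    ))
}

iris %>% baseline_table_named(var = c("Sepal.Length", "Sepal.Width"))
```

With this pattern the result columns are named mean_of_Sepal.Length, sd_of_Sepal.Length, mean_of_Sepal.Width and sd_of_Sepal.Width, following Species.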