
Summarise multiple variables by one group at a time

There are a number of questions and answers about summarising multiple variables by one or more groups (e.g., "Means multiple columns by multiple groups"). I don't think this is a duplicate.

Here's what I'm trying to do: I want to calculate the mean for 4 variables by Displacement, then calculate the mean for those same four by Horsepower, and so on. I don't want to group by vs, am, gear, and carb simultaneously (i.e., I'm not looking for simply mydata %>% group_by(vs, am, gear, carb) %>% summarise_if(...)).

How can I calculate the means for a set of variables by Displacement, then calculate the means for that same set of variables by Horsepower, etc., then place them in a table side by side?

I tried to come up with a reproducible example but couldn't. Here is a tibble from mtcars that shows what I'm ultimately looking for (data is made up):

tibble(Item = c("vs", "am" ,"gear", "carb"), 
   "Displacement (mean)"  = c(2.4, 1.4, 5.5, 1.3),
   "Horsepower (mean)" = c(155, 175, 300, 200))

Perhaps something like this using purrr::map and some rlang syntax?

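library(dplyr)
library(purrr)

grps <- list("cyl", "vs")
# Summarise once per grouping variable, then stack the results
map(setNames(grps, unlist(grps)), function(x)
  mtcars %>%
    group_by(!!rlang::sym(x)) %>%
    summarise(mean.mpg = mean(mpg), mean.disp = mean(disp)) %>%
    rename(id.val = 1)) %>%   # give the grouping column a common name
  bind_rows(.id = "id")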
## A tibble: 5 x 4
#  id    id.val mean.mpg mean.disp
#  <chr>  <dbl>    <dbl>     <dbl>
#1 cyl       4.     26.7      105.
#2 cyl       6.     19.7      183.
#3 cyl       8.     15.1      353.
#4 vs        0.     16.6      307.
#5 vs        1.     24.6      132.
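
The same idea can be written with newer dplyr idioms. What follows is only a sketch, assuming dplyr 1.0 or later (for across() and the .data pronoun); the names vars and by_group, and the choice of mpg and disp as the variables to average, are illustrative and not from the original answer.

```r
library(dplyr)
library(purrr)

grps <- c("cyl", "vs")    # grouping variables, handled one at a time
vars <- c("mpg", "disp")  # variables to average (illustrative choice)

by_group <- map(set_names(grps), function(g) {
  mtcars %>%
    # .data[[g]] looks up the column whose name is stored in g
    group_by(id.val = .data[[g]]) %>%
    # one mean per variable in vars, named mean.<variable>
    summarise(across(all_of(vars), mean, .names = "mean.{.col}"))
}) %>%
  bind_rows(.id = "id")   # "id" records which grouping each row came from

by_group
```

Adding another grouping variable or another variable to average then only means extending grps or vars.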

With so few groupings, why not do each set of means one at a time:

out1 <- mydata %>% group_by(Var1) %>% 
    summarise(mean_1a = mean(var_a), mean_1b = mean(var_b))

out2 <- mydata %>% group_by(Var2) %>% 
    summarise(mean_2a = mean(var_a), mean_2b = mean(var_b))

out3 <- mydata %>% group_by(Var3) %>% 
    summarise(mean_3a = mean(var_a), mean_3b = mean(var_b))

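If it makes sense to place the results side-by-side, you could do so with something like the following. Note that cbind only lines rows up by position, so this is meaningful only when each result has the same number of rows: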

result <- cbind(out1, out2, out3)
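
As a concrete illustration of the one-at-a-time approach, here is a small sketch on mtcars. The choice of vs and am as grouping variables is an assumption made so that both summaries have two rows; the object and column names are hypothetical.

```r
library(dplyr)

# Means of mpg and disp by vs (two levels: 0 and 1)
out_vs <- mtcars %>%
  group_by(vs) %>%
  summarise(mean_mpg_vs = mean(mpg), mean_disp_vs = mean(disp))

# The same means by am (also two levels: 0 and 1)
out_am <- mtcars %>%
  group_by(am) %>%
  summarise(mean_mpg_am = mean(mpg), mean_disp_am = mean(disp))

# Both results have two rows, so they can be placed next to each other
side_by_side <- cbind(out_vs, out_am)
side_by_side
```

If the grouping variables have different numbers of levels (cyl has three, vs has two), stacking the summaries in long form with a column identifying the grouping is likely the safer layout.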
