I want to do a group_by + summarise operation on only two columns, with one grouping attribute, while keeping the other three columns unchanged. Those three columns hold the same value in every row. How can I do that? For example:
> data<- data.frame(a=1:10, b=rep(1,10), c=rep(2,10), d=rep(3,10), e= c("small", "med", "larg", "larg", "larg", "med", "small", "small", "small", "med"))
> data %>% group_by(e) %>% summarise(a=mean(a))
# A tibble: 3 × 2
e a
<chr> <dbl>
1 larg 4
2 med 6
3 small 6.25
but I want
# A tibble: 3 × 5
e a b c d
<chr> <dbl> <dbl> <dbl> <dbl>
1 larg 4 1 2 3
2 med 6 1 2 3
3 small 6.25 1 2 3
group_by + summarise always drops the other columns. How can I keep them?
Add the other columns to group_by:
> library(tidyverse)
> data <- data.frame(a=1:10, b=rep(1,10), c=rep(2,10), d=rep(3,10), e= c("small", "med", "larg", "larg", "larg", "med", "small", "small", "small", "med"))
> data %>% group_by(e, b, c, d) %>% summarise(a=mean(a))
`summarise()` has grouped output by 'e', 'b', 'c'. You can override using the `.groups` argument.
# A tibble: 3 x 5
# Groups: e, b, c [3]
e b c d a
<chr> <dbl> <dbl> <dbl> <dbl>
1 larg 1 2 3 4
2 med 1 2 3 6
3 small 1 2 3 6.25
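The message in the output above reports that the result is still grouped by e, b and c. As a minimal sketch, assuming dplyr 1.0.0 or later, passing `.groups = "drop"` returns an ungrouped tibble and silences that message. The `relocate()` call is only there to match the column order asked for in the question.

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(
  a = 1:10, b = rep(1, 10), c = rep(2, 10), d = rep(3, 10),
  e = c("small", "med", "larg", "larg", "larg",
        "med", "small", "small", "small", "med")
)

result <- data %>%
  group_by(e, b, c, d) %>%                    # constant columns ride along as groups
  summarise(a = mean(a), .groups = "drop") %>% # drop all grouping afterwards
  relocate(a, .after = e)                      # order columns as e, a, b, c, d

result
```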
You can also calculate a new variable with group_by + summarise and keep the rest of your data frame "intact" by adding across() inside summarise. This can be useful if your other variables aren't always going to be the same.
data %>% group_by(e) %>%
  summarise(a=mean(a), across())
# A tibble: 10 x 5
# Groups: e [3]
e a b c d
<chr> <dbl> <dbl> <dbl> <dbl>
1 larg 4 1 2 3
2 larg 4 1 2 3
3 larg 4 1 2 3
4 med 6 1 2 3
5 med 6 1 2 3
6 med 6 1 2 3
7 small 6.25 1 2 3
8 small 6.25 1 2 3
9 small 6.25 1 2 3
10 small 6.25 1 2 3
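Note that the output above keeps all 10 rows, with the group mean repeated on each row. In dplyr 1.1.0 and later, calling across() without selecting columns and returning more than one row per group from summarise() are both deprecated. As a hedged alternative, a grouped `mutate()` should give the same values per row. It differs from the output above in that the columns stay in their original order (a, b, c, d, e) and the rows are not sorted by group.

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(
  a = 1:10, b = rep(1, 10), c = rep(2, 10), d = rep(3, 10),
  e = c("small", "med", "larg", "larg", "larg",
        "med", "small", "small", "small", "med")
)

with_mean <- data %>%
  group_by(e) %>%
  mutate(a = mean(a)) %>%  # replace a with its group mean, keeping every row
  ungroup()

with_mean
```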
It is unclear how many columns you want to treat as grouping variables. If the number is small, @tauft's answer is sufficient. Otherwise, we can use across with group_by so that we can use <tidy-select> to select the columns to group by.
library(dplyr)
data2 <- data %>%
  group_by(across(-a)) %>%
  summarise(a = mean(a), .groups = "drop") %>%
  relocate(e, a, .before = b)
data2
# # A tibble: 3 x 5
# e a b c d
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 larg 4 1 2 3
# 2 med 6 1 2 3
# 3 small 6.25 1 2 3
The above can also be written as follows.
data2 <- data %>%
  group_by(across(b:e)) %>%
  summarise(a = mean(a), .groups = "drop") %>%
  relocate(e, a, .before = b)
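Another option, not taken from the answers above, is to group by e only and carry the constant columns through with `first()`. This is a sketch that relies on the question's statement that b, c and d hold the same value in every row. If they did not, `first()` would silently keep only the first value in each group.

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(
  a = 1:10, b = rep(1, 10), c = rep(2, 10), d = rep(3, 10),
  e = c("small", "med", "larg", "larg", "larg",
        "med", "small", "small", "small", "med")
)

result_first <- data %>%
  group_by(e) %>%
  summarise(
    a = mean(a),
    across(b:d, first),  # assumes b, c, d are constant within each group
    .groups = "drop"
  )

result_first
```

The columns come out in the order e, a, b, c, d, so no `relocate()` step is needed here.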