dplyr group_by and summarise, but keep non-numeric variables

I have a dataset in long format, where I add up values within different groups. Some variables are factors and should be kept in the result.

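library(dplyr)

# Store the car names (row names) as a factor column
mtcars$model <- as.factor(rownames(mtcars))
# Stack three copies to simulate long-format data
longmtcars <- rbind(mtcars, mtcars, mtcars)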

longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

result <- longmtcars %>%
    group_by(factor(model)) %>%
    summarise_if(is.numeric, sum)
result

# A tibble: 32 x 11
   `factor(model)`      mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
   <fct>              <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 AMC Javelin         45.6    24  912    450  9.45 10.3   51.9     0     9     6
 2 Cadillac Fleetwood  31.2    24 1416    615  8.79 15.8   53.9     0     9    12
 3 Camaro Z28          39.9    24 1050    735 11.2  11.5   46.2     0     9    12
 4 Chrysler Imperial   44.1    24 1320    690  9.69 16.0   52.3     0     9    12
 5 Datsun 710          68.4    12  324    279 11.6   6.96  55.8     3    12     3

My current, non-scalable solution:

#ugly solution

vsvar <- longmtcars[1:32, "vs"]
result <- cbind(result, vsvar)
result

         factor(model)   mpg cyl   disp   hp  drat     wt  qsec am gear carb vsvar
1          AMC Javelin  45.6  24  912.0  450  9.45 10.305 51.90  0    9    6    No
2   Cadillac Fleetwood  31.2  24 1416.0  615  8.79 15.750 53.94  0    9   12    No
3           Camaro Z28  39.9  24 1050.0  735 11.19 11.520 46.23  0    9   12   Yes

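This runs, but it is ugly and fragile: it relies on the first 32 rows of longmtcars being in the same order as the grouped result, which is sorted by model, so the vs values can end up on the wrong rows (Camaro Z28 is shown as "Yes" above although its vs is 0). I will use this in a Shiny app, where that will cause trouble, so the current approach is not an option. Is there an all-in-one solution? data.table would also be fine, but I am not too familiar with it.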

You could add that (those) variable(s) to the group_by clause:

result <- longmtcars %>%
  mutate_if(is.character, factor) %>%
  group_by(model, vs) %>%
  summarise_if(is.numeric, sum)

result
#> # A tibble: 32 x 12
#> # Groups:   model [32]
#>    model              vs      mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
#>    <fct>              <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 AMC Javelin        No     45.6    24  912    450  9.45 10.3   51.9     0     9     6
#>  2 Cadillac Fleetwood No     31.2    24 1416    615  8.79 15.8   53.9     0     9    12
#>  3 Camaro Z28         No     39.9    24 1050    735 11.2  11.5   46.2     0     9    12
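
As a side note, the `_if` verbs are superseded in newer dplyr releases. A minimal sketch of the same idea with across(), assuming dplyr 1.0.0 or later is installed; the name mtcars2 is only introduced here so the built-in mtcars is not modified:

```r
library(dplyr)

# Rebuild the example data from the question (mtcars2 is a working copy)
mtcars2 <- mtcars
mtcars2$model <- factor(rownames(mtcars2))
longmtcars <- rbind(mtcars2, mtcars2, mtcars2)
longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

# Group by every non-numeric variable that should survive,
# then sum all numeric columns within each group
result <- longmtcars %>%
  group_by(model, vs) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")
```

Here `.groups = "drop"` returns an ungrouped tibble, which tends to be easier to pass around in a Shiny app. This only works as intended when vs is constant within each model; otherwise a model would be split across several rows.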

In base R you could use aggregate():

# Sum every numeric column within each model/vs combination
result <- aggregate(as.matrix(longmtcars[sapply(longmtcars, is.numeric)]) ~ model + vs,
                    longmtcars, sum)
head(result)
#                model vs  mpg cyl disp  hp  drat     wt  qsec am gear carb
# 1        AMC Javelin No 45.6  24  912 450  9.45 10.305 51.90  0    9    6
# 2 Cadillac Fleetwood No 31.2  24 1416 615  8.79 15.750 53.94  0    9   12
# 3         Camaro Z28 No 39.9  24 1050 735 11.19 11.520 46.23  0    9   12
# 4  Chrysler Imperial No 44.1  24 1320 690  9.69 16.035 52.26  0    9   12
# 5   Dodge Challenger No 46.5  24  954 450  8.28 10.560 50.61  0    9    6
# 6         Duster 360 No 42.9  24 1080 735  9.63 10.710 47.52  0    9   12
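
Since the question also mentions data.table, here is a rough sketch of the same grouping with that package. It assumes data.table is installed; dt, num_cols and result_dt are names made up for this example, and mtcars2 is a working copy so the built-in mtcars stays untouched:

```r
library(data.table)

# Rebuild the example data from the question (mtcars2 is a working copy)
mtcars2 <- mtcars
mtcars2$model <- factor(rownames(mtcars2))
longmtcars <- rbind(mtcars2, mtcars2, mtcars2)
longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

dt <- as.data.table(longmtcars)

# Names of the numeric columns that should be summed
num_cols <- names(dt)[vapply(dt, is.numeric, logical(1))]

# Sum the numeric columns per model/vs combination;
# keyby (rather than by) also sorts the result by the grouping columns
result_dt <- dt[, lapply(.SD, sum), keyby = .(model, vs), .SDcols = num_cols]
```

As with the other approaches, the non-numeric columns are kept by making them part of the grouping, so this assumes they do not vary within a model.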
