
Find row maximum across columns by using vector of column names in dplyr

I have a long list of column names in a character vector that refer to various medications. I like to keep that list at the top of my code to make it easy to edit and easy to reference the group of medications at various points in my script. I would like to take the row maximum across the medications using dplyr by feeding it the pre-defined vector of column names to find the maximum across. It seems like there is a simple fix but it is escaping me today...

I tried the code below but it returns one of the names in the list of column names.

I also tried various permutations using get(), select() and do.call() to try and make R read the character vector differently but I couldn't figure it out...

library(dplyr)
data(mtcars)

colnames <- c("vs", "am", "gear", "carb")

# Attempt: max() receives the character vector itself, not the columns it names
df <- mtcars %>%
  rowwise() %>%
  mutate(max = max(colnames))

EDIT: I'd like the maximum to be shown in a new column. For example, I'd like the output as the following:

vs am gear carb MAX
0  1   4    4    4
0  1   4    4    4
1  1   4    1    4
1  0   3    1    3
0  0   3    2    3

You can summarise a selected set of columns, or a vector of column names such as yours, using summarise_at from dplyr:

library(dplyr)
data(mtcars)

colnames <- c("vs", "am", "gear", "carb")

df <- mtcars %>%
  summarise_at(colnames, list(max))

  vs am gear carb
1  1  1    5    8

You specify the columns first and the function second; in this case max. The syntax is the same for select_at, mutate_at and rename_at. Note that summarise_at collapses the data to a single row holding the maximum of each specified column, so it does not add a per-row maximum as a new column.
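In dplyr 1.0.0 and later the `_at` verbs are superseded by across(). The following is a minimal sketch of the same column-wise summary, assuming a recent dplyr is installed; `col_max` is a name chosen here for illustration.

```r
library(dplyr)

colnames <- c("vs", "am", "gear", "carb")

# all_of() marks `colnames` as an external character vector of column names
col_max <- mtcars %>%
  summarise(across(all_of(colnames), max))

col_max
```

Like summarise_at, this returns one maximum per column, so it is not yet the row-wise result the question asks for.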

You could also tidy the data by reshaping it to long format first, then finding the maximum per row and joining it back onto the original data. Note that you have to use gather_() here, which takes all names as quoted strings, so that you can pass your vector. In this example I am using the car in place of your drug, and I have not addressed what happens when there is a tie for the maximum value.

library(dplyr)
library(tidyr)
colnames <- c("vs", "am", "gear", "carb")

# keep the row names as a key column for the join
df <- mtcars %>%
  mutate(nms = row.names(mtcars))

# reshape to long format, then keep only the row(s) holding the max value per car
dfx <- tidyr::gather_(df, 'nms2', 'vals', colnames) %>%
  group_by(nms) %>%
  mutate(max = max(vals)) %>%
  ungroup() %>%
  filter(max == vals)

# join the column name and max value back onto the data
mt2 <- left_join(df, select(dfx, nms, vals, nms2), by = 'nms')
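gather_() is deprecated in current tidyr releases. As a hedged sketch of the same long-format idea, the version below swaps in pivot_longer() and summarises to one value per car, so ties cannot duplicate rows in the join. It assumes tidyr 1.0.0 or later and dplyr 1.0.0 or later; `row_max` is a name introduced here for illustration. Unlike the code above, it does not record which column held the maximum.

```r
library(dplyr)
library(tidyr)

colnames <- c("vs", "am", "gear", "carb")

# Keep the row names as a key column for the join
df <- mtcars %>%
  mutate(nms = row.names(mtcars))

# Reshape to long format, then reduce to a single maximum per car
row_max <- df %>%
  pivot_longer(all_of(colnames), names_to = "nms2", values_to = "vals") %>%
  group_by(nms) %>%
  summarise(max = max(vals), .groups = "drop")

# One row per car in row_max, so the join keeps the original row count
mt2 <- left_join(df, row_max, by = "nms")
```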

Using pmax and much less code

You can use pmax inside a do.call to get the row-wise maximum:

df <- mtcars %>%
  mutate(mx2 = do.call(pmax, mtcars[, colnames]))
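The snippet above indexes mtcars directly, so it would ignore any filtering or reordering done earlier in a pipeline. A minimal sketch that selects the columns from the piped data instead, assuming dplyr 1.0.0 or later for across(); the `na.rm = TRUE` argument is an addition here, in case the medication columns contain missing values.

```r
library(dplyr)

colnames <- c("vs", "am", "gear", "carb")

# across() returns the selected columns of the current data as a data frame;
# c() turns it into a list of columns plus the na.rm argument for pmax()
df <- mtcars %>%
  mutate(mx2 = do.call(pmax, c(across(all_of(colnames)), na.rm = TRUE)))

head(df$mx2)
```

Because pmax() is vectorised over whole columns, this avoids the per-row overhead of rowwise() or apply().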

It may not be the most dplyr answer, but you could always use apply inside mutate :

col_names <- c("vs", "am", "gear", "carb")

mtcars %>%
  mutate(max_val = apply(., 1, function(x) max(x[col_names]))) %>%
  head()

   mpg cyl disp  hp drat    wt  qsec vs am gear carb max_val2 max_val
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4        4       4
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4        4       4
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1        4       4
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1        3       3
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2        3       3
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1        3       3
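One caveat with this approach: apply() converts the whole data frame to a matrix first, so if any column is non-numeric, every value becomes a string and max() compares text. A small sketch of the issue and a workaround, using the built-in iris data purely as an illustration (it is not part of the original answer); `max_val` and `coerced` are names chosen here.

```r
col_names <- c("vs", "am", "gear", "carb")

# Subset to the columns of interest before apply(), so the matrix stays numeric
max_val <- apply(mtcars[col_names], 1, max)
head(max_val)

# Counter-example: iris has a factor column, so the row vectors are character
coerced <- apply(iris, 1, function(x) max(x[c("Sepal.Length", "Petal.Length")]))
class(coerced)
```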

Or, you could do something like this:

mtcars$max_val2 <- mtcars %>%
  select(all_of(col_names)) %>%
  transmute(apply(., 1, max)) %>%
  pull()
head(mtcars)

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb max_val2
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4        4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4        4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1        4
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1        3
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2        3
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1        3

Using c_across with your initial attempt appears to work:

mycols <- c("vs", "am", "gear", "carb")

df <- mtcars %>% 
  rowwise() %>%
  mutate(MAX = max(c_across(all_of(mycols)))) 
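For context, the original attempt fails because max() is handed the character vector itself and therefore compares the strings. The sketch below shows that, and then checks the c_across() result on the first few rows. It assumes dplyr 1.0.0 or later; the ungroup() call is an addition here so that later verbs do not keep operating row by row.

```r
library(dplyr)

mycols <- c("vs", "am", "gear", "carb")

# Comparing the names as strings: this is what the original attempt computed
max(mycols)

df <- mtcars %>%
  rowwise() %>%
  mutate(MAX = max(c_across(all_of(mycols)))) %>%
  ungroup()  # drop the rowwise grouping once the row maximum is computed

df %>%
  select(all_of(mycols), MAX) %>%
  head(5)
```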
