I have a long list of column names in a character vector that refer to various medications. I like to keep that list at the top of my code to make it easy to edit and easy to reference the group of medications at various points in my script. I would like to take the row maximum across the medications using dplyr by feeding it the pre-defined vector of column names to find the maximum across. It seems like there is a simple fix but it is escaping me today...
I tried the code below but it returns one of the names in the list of column names.
I also tried various permutations using get(), select() and do.call() to try and make R read the character vector differently but I couldn't figure it out...
data(mtcars)
colnames <- c("vs", "am", "gear", "carb")
df <- mtcars %>%
rowwise() %>%
mutate(max = max(colnames))
EDIT: I'd like the maximum to be shown in a new column. For example, I'd like the output as the following:
vs am gear carb MAX
0 1 4 4 4
0 1 4 4 4
1 1 4 1 4
1 0 3 1 3
0 0 3 2 3
You can summarise a chosen set of columns, or a character vector of column names such as yours, using summarise_at from dplyr:
library(dplyr)
data(mtcars)
colnames <- c("vs", "am", "gear", "carb")
df <- mtcars %>%
  summarise_at(colnames, list(max))
vs am gear carb
1 1 1 5 8
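You simply specify the columns first and the function second, in this case max. select_at, mutate_at and rename_at use the same syntax. summarise_at keeps the specified column names rather than creating new ones. Note that it collapses the data to a single row holding the maximum of each listed column, so this gives column maxima rather than the per-row maximum asked for in the question.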
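For what it's worth, in dplyr 1.0.0 and later the _at verbs are superseded by across(). Here is a minimal sketch of the same column-wise summary. The name med_cols is my own placeholder for the vector of column names, chosen so it does not share a name with base R's colnames() function.

```r
library(dplyr)

# Placeholder name for the vector of columns of interest
med_cols <- c("vs", "am", "gear", "carb")

# One row holding the maximum of each listed column
col_max <- mtcars %>%
  summarise(across(all_of(med_cols), max))

col_max
```

As with summarise_at, this is a per-column maximum, not a per-row one.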
You could also tidy the data by making it long first, then find the max and join it back onto the original data. Note that you have to use gather_() here, with all names in quotes, so that you can reference your vector. In this example the car data stands in for your drug data, and ties for the max value are not handled: a row with a tie appears once per tied column after the join.
library(dplyr)
library(tidyr)
colnames <- c("vs", "am", "gear", "carb")
df <- mtcars %>%
mutate(nms = row.names(mtcars))
# reshape to long, compute the max per row, and keep only the rows holding the max
dfx <- tidyr::gather_(df, 'nms2', 'vals', colnames) %>%
  group_by(nms) %>%
  mutate(max = max(vals)) %>%
  ungroup() %>%
  filter(max == vals)
# join back onto the data with the column name and max value
mt2 <- left_join(df, select(dfx, nms, vals, nms2), by = 'nms')
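gather_() is deprecated in current tidyr, so here is a rough sketch of the same long-then-join idea using pivot_longer() with all_of(). It assumes a reasonably recent dplyr and tidyr. The names med_cols, row_max and df_max are placeholders of mine, and I summarise to one row per car so that ties do not duplicate rows (at the cost of not reporting which column held the max).

```r
library(dplyr)
library(tidyr)

# Placeholder name for the vector of columns of interest
med_cols <- c("vs", "am", "gear", "carb")

# Keep the row names as an id column to join on
df <- mtcars %>%
  mutate(nms = row.names(mtcars))

# Long format: one row per (car, column) pair, then one max per car
row_max <- df %>%
  pivot_longer(all_of(med_cols), names_to = "nms2", values_to = "vals") %>%
  group_by(nms) %>%
  summarise(MAX = max(vals), .groups = "drop")

# Join the per-row max back onto the original data
df_max <- left_join(df, row_max, by = "nms")
```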
You can use pmax inside a do.call to get the row-wise maximum:
df <- mtcars %>%
mutate(mx2 = do.call(pmax,mtcars[,colnames]))
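If some of the columns can contain missing values, pmax also takes na.rm. Below is a small base R sketch of that variation. The names med_cols and MAX are placeholders of mine, and mtcars itself has no NAs, so na.rm makes no difference for this particular data.

```r
# Placeholder name for the vector of columns of interest
med_cols <- c("vs", "am", "gear", "carb")

df <- mtcars

# c() turns the selected columns into a list and appends na.rm as an
# extra named argument, so this is pmax(vs, am, gear, carb, na.rm = TRUE)
df$MAX <- do.call(pmax, c(df[med_cols], na.rm = TRUE))

head(df[c(med_cols, "MAX")], 5)
```

For the first five rows this gives 4, 4, 4, 3, 3, matching the output requested in the question.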
It may not be the most dplyr answer, but you could always use apply inside mutate:
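# the vector of column names, called col_names in this answer
col_names <- c("vs", "am", "gear", "carb")
mtcars %>%
  mutate(max_val = apply(., 1, function(x) max(x[col_names]))) %>%
  head()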
mpg cyl disp hp drat wt qsec vs am gear carb max_val2 max_val
1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 4 4
2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 4 4
3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 4 4
4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 3 3
5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 3 3
6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 3 3
Or, you could do something like this:
mtcars$max_val2 <- mtcars %>%
  select(all_of(col_names)) %>%  # all_of() makes clear that col_names is an external vector
  transmute(apply(., 1, max)) %>%
  pull()
head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb max_val2
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 4
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 3
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 3
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 3
Using c_across (available in dplyr 1.0.0 and later) with your initial attempt appears to work:
mycols <- c("vs", "am", "gear", "carb")
df <- mtcars %>%
rowwise() %>%
mutate(MAX = max(c_across(all_of(mycols))))