
Merge Several Dummy Variable Columns by name

My question is as follows:

library(caret)
data(cars)
head(cars)
colnames(cars)

The expected result is what this index-based code produces:

cars$type <- names(cars[14:18])[max.col(cars[14:18])] 

However, I would like to select the columns by name rather than by position. The attempt below does not work. How can I get around this? Many thanks in advance.

cars$type <- cars[c("convertible", "coupe", "hatchback", "sedan", "wagon" )][apply(cars[c("convertible", "coupe", "hatchback", "sedan", "wagon" )], 1, match, x = 1)] 

head(cars)

You can subset the data frame by column name in a similar fashion:

cols <- c("convertible", "coupe", "hatchback", "sedan", "wagon" )
cars$type <- cols[max.col(cars[cols])] 

To check that the output matches the index-based version:

identical(cols[max.col(cars[cols])] , names(cars[14:18])[max.col(cars[14:18])])
#[1] TRUE
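For readers without caret installed, here is a minimal sketch of the same name-based lookup on a small made-up data frame. The `toy` data frame and its values are hypothetical and are not taken from the `cars` dataset; only `max.col` and the subsetting idiom come from the answer above.

```r
# Hypothetical toy data with one-hot (dummy) columns; not the caret `cars` data
toy <- data.frame(
  price = c(100, 200, 300),
  coupe = c(0, 1, 0),
  sedan = c(1, 0, 0),
  wagon = c(0, 0, 1)
)

# Select the dummy columns by name, find the position of the row-wise maximum,
# and map that position back to the column name
cols <- c("coupe", "sedan", "wagon")
toy$type <- cols[max.col(toy[cols], ties.method = "first")]

toy$type
```

Because each row has exactly one 1, the position of the row maximum identifies the active dummy column, and indexing `cols` with it recovers the label.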

It is better to specify ties.method explicitly. The default for max.col is "random", so whenever a row has more than one maximum value the result can change from run to run.

cols <- c("convertible", "coupe", "hatchback", "sedan", "wagon" )
cars$type <- cols[max.col(cars[cols], "first")]
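As a rough illustration of why the tie rule matters, the sketch below uses a made-up data frame in which one row has no dummy set and another has two set. The data and the `rowSums` guard are assumptions added for illustration and are not part of the original answer.

```r
# Hypothetical data: row 1 is clean, row 2 is all zeros, row 3 has two 1s
toy <- data.frame(
  coupe = c(1, 0, 1),
  sedan = c(0, 0, 1),
  wagon = c(0, 0, 0)
)
cols <- c("coupe", "sedan", "wagon")

# Deterministic tie rules give different answers for the ambiguous rows
first <- cols[max.col(toy[cols], ties.method = "first")]
last  <- cols[max.col(toy[cols], ties.method = "last")]

# Optional guard (an assumption, not from the answer): mark rows that do not
# have exactly one dummy set as NA instead of trusting the tie rule
type <- first
type[rowSums(toy[cols]) != 1] <- NA

first
last
type
```

With "first" or "last" the result is reproducible, but a row with zero or several dummies set still gets an arbitrary label, so a check like the `rowSums` guard may be worth adding if the data is not guaranteed to be strictly one-hot.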
