
Gathering multiple dummy variables as one categorical variable in R

I'm aware of this solution, but I am having difficulty applying it to data that doesn't consist only of dummy variables.

Here is some sample data to load, essentially a series of expenses:

df <- data.frame(
  Charge         = c(12, 4, 6, 10, 5, 9),
  Groceries      = c(1, 0, 0, 0, 0, 0),
  Utilities      = c(0, 1, 0, 0, 0, 0),
  Consumables    = c(0, 0, 1, 0, 0, 0),
  Transportation = c(0, 0, 0, 1, 0, 0),
  Entertainment  = c(0, 0, 0, 0, 1, 0),
  Misc           = c(0, 0, 0, 0, 0, 1)
)

I would like to create a new variable "Category" that holds, for each row, the name of the column currently coded as 1. I am able to do this with ifelse, but I am looking for a more general solution, e.g. one from the reshape package.

Currently, I can only solve this with:

df$Category <- ifelse(df$Groceries==1, "Groceries",      
                      ifelse(df$Utilities==1,"Utilities",
                             ifelse(df$Consumables==1,"Consumables",
                                    ifelse(df$Transportation==1,"Transportation",
                                           ifelse(df$Entertainment==1,"Entertainment","Misc")))))

If every row contains exactly one 1, use max.col to get the column index of the maximum value in each row, then use that index to subset the names of the dummy columns (df[-1] drops the Charge column):

df$Category <- names(df)[-1][max.col(df[-1])]
df$Category
#[1] "Groceries"      "Utilities"      "Consumables"    "Transportation" "Entertainment"  "Misc"  
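
The one-liner relies on the "exactly one 1 per row" assumption. Below is a minimal sketch of a more defensive variant, assuming the dummy columns are every column except Charge. The names dummy_cols, dummies, ok and idx are only illustrative. It checks the assumption with rowSums and passes ties.method = "first" to max.col, because the default ("random") picks among tied columns at random. Rows that break the assumption get NA rather than a misleading category.

```r
# Sample data from the question: one Charge column plus six dummy columns.
df <- data.frame(
  Charge         = c(12, 4, 6, 10, 5, 9),
  Groceries      = c(1, 0, 0, 0, 0, 0),
  Utilities      = c(0, 1, 0, 0, 0, 0),
  Consumables    = c(0, 0, 1, 0, 0, 0),
  Transportation = c(0, 0, 0, 1, 0, 0),
  Entertainment  = c(0, 0, 0, 0, 1, 0),
  Misc           = c(0, 0, 0, 0, 0, 1)
)

# Assumption: every column other than Charge is a 0/1 dummy.
dummy_cols <- setdiff(names(df), "Charge")
dummies <- as.matrix(df[dummy_cols])

# TRUE for rows that contain exactly one 1.
ok <- rowSums(dummies == 1) == 1

# Column index of the row maximum; "first" makes ties deterministic.
idx <- max.col(dummies, ties.method = "first")

# Map indices to column names, and blank out rows that violate the assumption.
category <- dummy_cols[idx]
category[!ok] <- NA_character_
df$Category <- category
```

Selecting the dummy columns by name instead of by position (df[-1]) also means the code can be re-run safely after Category has already been added to df.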
