
Assigning factors to columns in matrix in R

I am trying to use the missForest package to impute missing data into a fairly large dataset.

missForest takes data in the form of a data matrix with missing values, where the columns correspond to the variables and the rows to the observations. I therefore converted my data frame to a matrix, which inadvertently turned all of my categorical variables into numeric columns.
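To illustrate the conversion described above, here is a minimal base-R sketch. The data frame `toy` and its values are made up for illustration; only the column names are borrowed from the real data.

```r
# Hypothetical toy data; not the poster's actual dataset.
toy <- data.frame(
  GENDER  = factor(c("F", "M", NA, "F")),
  AGEcont = c(34, NA, 51, 27)
)

m <- data.matrix(toy)

# A matrix is one atomic vector with dimensions, so it can hold only one type.
# data.matrix() therefore replaces each factor with its internal integer codes.
is.numeric(m)        # TRUE
m[, "GENDER"]        # 1 2 NA 1 (codes, not labels)
levels(toy$GENDER)   # "F" "M" (labels survive only in the data frame)
```

Because a matrix cannot carry per-column types, there is no way to mark a single matrix column as a factor; that information only exists in a data frame.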

Does anyone know how to assign a column of a matrix as a factor?

Thank you so much!!!

OK, let me add more details.

I have a data frame that has the following columns.

homt_sub <- homt[c("CASEID", "REASON", "PSYPROB", "SUB2.2", "FREQ1", "FRSTUSE1", "FREQ2", "AGEcont", "GENDER", "RACE2", "ARRESTS")]

The only continuous variable is AGEcont. The rest are factors. I had to make a matrix to use the missForest function.

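# data.matrix() replaces every factor with its integer codes
homt_matrix <- data.matrix(homt_sub, rownames.force = NA)
# so missForest sees all columns as continuous
homt_sub.imp <- missForest(homt_matrix, verbose = TRUE, maxiter = 3, ntree = 20)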

I can extract the imputations from here, but I get decimal values because the factors were treated as continuous variables.

model.matrix could be a solution, but it seems burdensome to create so many extra dummy variables, impute them, and then collapse them back into one column per variable. I know there is a way to run randomForest with factor variables, but it is very unclear to me how to do it.
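One possible route that avoids both the matrix and the dummy variables is sketched below. It assumes that missForest accepts a data frame whose categorical columns are factors, as its documented example with the iris data suggests. The data frame `toy` is a made-up stand-in for homt_sub, and the call is guarded so the sketch still runs where the package is not installed.

```r
# Hypothetical stand-in for homt_sub; values are random, not real data.
set.seed(1)
n <- 60
toy <- data.frame(
  GENDER  = sample(c("F", "M"), n, replace = TRUE),
  ARRESTS = sample(c("0", "1", "2+"), n, replace = TRUE),
  AGEcont = round(runif(n, 18, 65))
)
toy$GENDER[c(3, 17)]  <- NA
toy$AGEcont[c(5, 40)] <- NA

# Convert every categorical column to a factor; AGEcont stays numeric.
cat_cols <- setdiff(names(toy), "AGEcont")
toy[cat_cols] <- lapply(toy[cat_cols], factor)

# Pass the data frame itself, not data.matrix(toy).
if (requireNamespace("missForest", quietly = TRUE)) {
  imp <- missForest::missForest(toy, maxiter = 3, ntree = 20)
  str(imp$ximp)  # imputed data; inspect the column types here
}
```

An identifier column such as CASEID would probably need to be left out before imputing, since randomForest cannot handle categorical predictors with more than 53 levels.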

Thank you so much
