
Variable lengths differ with random forest

I'm really new to R and I want to build a random forest. However, I keep getting the same error:

Error in model.frame.default, lengths of variables differ.

I know this issue has been solved in another topic by constructing a formula from strings with as.formula, but I really have no idea how to do that. Can you help me, please? Thank you.

# Vector that randomly assigns each row to group 1 (about 70%) or group 2 (about 30%)
index = sample(2,nrow(df), replace = TRUE, prob=c(0.7,0.3)) 

#Training data
training = df[index==1,]

#Testing data
testing = df[index==2,]

#Random forest model 
RFM = randomForest(df$Rating~., df$Customer_type, data = training)

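The problem is that the response (dependent) variable in your formula is df$Rating, which comes from the full df data frame, while the predictors (the .) come from data = training. Since training holds only about 70% of the rows of df, the variables in the formula have different lengths, which is exactly what the error says. The extra df$Customer_type argument is also not part of the formula, so it does not act as a predictor. Use bare column names so that everything is looked up in training: I guess that randomForest(Rating ~ Customer_type, data = training) would work.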
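To make that concrete, here is a minimal sketch of the corrected split and model, plus the as.formula approach asked about in the question. The data frame below is a made-up stand-in: only the column names Rating and Customer_type come from the question, while the values, the Quantity column, and the assumption that Rating is numeric (so this is a regression forest) are mine.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(42)  # make the random split reproducible

# Hypothetical stand-in for the asker's `df`; only the column names
# Rating and Customer_type are taken from the question.
df <- data.frame(
  Rating        = runif(200, min = 4, max = 10),
  Customer_type = factor(sample(c("Member", "Normal"), 200, replace = TRUE)),
  Quantity      = sample(1:10, 200, replace = TRUE)
)

# Randomly assign each row to group 1 (about 70%) or group 2 (about 30%)
index    <- sample(2, nrow(df), replace = TRUE, prob = c(0.7, 0.3))
training <- df[index == 1, ]
testing  <- df[index == 2, ]

# Bare column names are looked up in `data`, so every variable
# in the formula has nrow(training) values.
RFM <- randomForest(Rating ~ Customer_type, data = training)

# The same model with the formula built from strings via as.formula
predictors <- c("Customer_type")
f    <- as.formula(paste("Rating ~", paste(predictors, collapse = " + ")))
RFM2 <- randomForest(f, data = training)

# Predictions for the held-out rows
pred <- predict(RFM, newdata = testing)
```

If you want every other column as a predictor, Rating ~ . with data = training should also work, because both sides of the formula are then taken from training.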
