
Can training set be used to determine variable importance using randomForest in R although the prediction of testing set is quite low?

Original 2019-03-12 19:05:42 5 1 r/ random-forest/ training-data

I am using randomForest in R. My model has an R^2 of 0.94 on the training data, but its predictive performance on the testing data is quite low. I would like to know whether I can still use this model, fitted on the training data, just to determine which variables are more important/effective for predicting the output.

Thanks

1 answer

The question is hard to answer from the little information you provide (consider adding more detail and background). Poor prediction on the testing set can result from poor tuning of the algorithm, or it can be inherent in the data, i.e. your predictors themselves are not very strongly related to the outcome. In the first case, prediction could improve with different parameters, e.g. more or fewer trees, a different value for mtry, etc. If that is the case, your importance measures are just as biased as your predictions and should be used with caution. If the predictors themselves are weak, your low-quality prediction is as good as it gets. In that case, I would say the importance measures can be used, but they only tell you which of your overall weak predictors are more or less weak.
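One way to tell the two cases apart is to try a few parameter settings, compare their out-of-bag (OOB) error, and only then look at the importance measures alongside the held-out R^2. The following is a minimal sketch, not a definitive recipe. The synthetic data, the variable names x1, x2, x3, the 200/100 split, and the grid of mtry values are all assumptions made for illustration; only randomForest(), predict() and importance() come from the randomForest package itself.

```r
library(randomForest)

set.seed(42)

# Synthetic data (assumption for illustration only):
# x1 is strongly related to y, x2 weakly, x3 not at all.
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 2 * dat$x1 + 0.3 * dat$x2 + rnorm(n)

# Hold out part of the data for testing
train_idx <- sample(n, 200)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Case 1 check: does a different mtry change the OOB error much?
oob_mse <- sapply(1:3, function(m) {
  fit <- randomForest(y ~ ., data = train, mtry = m, ntree = 500)
  tail(fit$mse, 1)  # OOB mean squared error after the last tree
})
best_mtry <- which.min(oob_mse)

# Refit with the chosen mtry and request permutation importance
rf <- randomForest(y ~ ., data = train, mtry = best_mtry,
                   ntree = 500, importance = TRUE)

# R^2 on the held-out testing data
pred <- predict(rf, newdata = test)
test_r2 <- 1 - sum((test$y - pred)^2) / sum((test$y - mean(test$y))^2)

# Permutation importance (%IncMSE), computed from OOB samples
imp <- importance(rf, type = 1)

print(oob_mse)
print(test_r2)
print(imp)
```

If the OOB error barely moves across settings and the held-out R^2 stays low, that points to weak predictors rather than poor tuning, and the importance values then only rank those weak predictors against each other.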

Variable importance plot using randomforest package in R

Function and looping in training and testing set using r

R random forest - training set using target column for prediction

running randomForest loop and variable importance in R

Split into training and testing set in R?

R randomForest importance

SVM Is working on Training set but not on testing set in R

Randomly split data by criterion into training and testing data set using R

understanding per class variable importance in 'randomForest' R package

Training and Testing set with createFolds function in R


The technical posts on this site follow the CC BY-SA 4.0 license. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.



solution1 0 2019-03-19 06:23:58
