
In R, the caret package's rfe function selects more features than the maximum given in sizes

I have a simple script that uses rfe to perform feature selection on different time periods of my data. I use the following rfeControl and rfe calls:

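library(caret)

# Random-forest helper functions, evaluated with 10-fold cross-validation
control <- rfeControl(functions = rfFuncs, method = "cv", number = 10)

# Evaluate subset sizes 1 through 12
results <- rfe(feature_selection_data,
               feature_selection_target$value,
               sizes = 1:12,
               rfeControl = control)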

Each time this runs I insert the values into a list:

include <- predictors(results)
include_list[[row]] <- include

Although I set sizes to a maximum of 12, in 2 of my 20 time periods the feature selection returns 65 features, which is the total number of features in the initial dataset.

I am new to this function and do not know what I'm doing wrong here. Any help is appreciated!

Thank you!

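If you look at the description of the RFE algorithm ( http://topepo.github.io/caret/recursive-feature-elimination.html ), you'll see that rfe always fits a model with all features in its first iteration, and that full set is kept as a candidate alongside the values in sizes. The sizes argument therefore lists the smaller subsets to evaluate; it is not an upper limit. With the default size selection in rfFuncs (best resampled performance), the two periods that returned all 65 features are presumably the ones where the full model scored best.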
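A minimal sketch of how to check which subset sizes were actually evaluated and which one was picked. The data here is simulated (a hypothetical stand-in for the question's data), and lmFuncs is swapped in for rfFuncs only to keep the run quick; the fields being inspected are the same either way.

```r
library(caret)

set.seed(1)
# Simulated stand-in data: 20 predictors, of which V1..V3 drive the outcome
x <- as.data.frame(matrix(rnorm(200 * 20), ncol = 20))
y <- x$V1 + 2 * x$V2 - x$V3 + rnorm(200)

control <- rfeControl(functions = lmFuncs, method = "cv", number = 5)
results <- rfe(x, y, sizes = 1:12, rfeControl = control)

# One row per evaluated size: 1..12 plus the full set of 20
print(results$results[, c("Variables", "RMSE")])

# The size rfe picked, and the predictors that go with it
print(results$optsize)
print(predictors(results))
```

If the row for the full set has the best value of the metric, that is the size rfe returns, regardless of the largest value passed in sizes.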

Your next question will probably be how to select one of the suboptimal models that has fewer features. One answer can be found here (although it's not too helpful): "Access all models produced by rfe in caret".
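One way to get at a smaller subset after the fact is to rank the variables from the resampling results yourself. The sketch below assumes caret's exported pickVars helper and the variables element of the rfe object; the data is simulated, lmFuncs stands in for rfFuncs to keep it light, and the cap of 5 is an arbitrary value chosen for illustration.

```r
library(caret)

set.seed(1)
# Simulated stand-in data (hypothetical): 20 predictors, 3 informative
x <- as.data.frame(matrix(rnorm(200 * 20), ncol = 20))
y <- x$V1 + 2 * x$V2 - x$V3 + rnorm(200)

control <- rfeControl(functions = lmFuncs, method = "cv", number = 5)
results <- rfe(x, y, sizes = 1:12, rfeControl = control)

# results$variables holds the per-resample importance scores;
# pickVars averages them per variable and returns the top `size` names
top5 <- pickVars(results$variables, size = 5)
print(top5)
```

This only gives the variable names; a model on that subset would still need to be refit.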

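I would suggest adjusting the selectSize function so that it accepts feature sets that are not optimal in terms of error but are smaller (see: http://topepo.github.io/caret/recursive-feature-elimination.html#the-selectsize-function ). caret ships pickSizeTolerance for this purpose, or you can write your own function that ignores subset sizes above your limit.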
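A sketch of the second option, under stated assumptions: the data is simulated, lmFuncs stands in for rfFuncs to keep the run fast, and max_size is a hypothetical name for the cap. The custom selectSize drops every candidate size above the cap and then hands the rest to caret's pickSizeBest.

```r
library(caret)

set.seed(1)
# Simulated stand-in data (hypothetical): 20 predictors, 3 informative
x <- as.data.frame(matrix(rnorm(200 * 20), ncol = 20))
y <- x$V1 + 2 * x$V2 - x$V3 + rnorm(200)

max_size <- 12  # hypothetical cap on the number of selected features

# Copy the helper list and replace only the size-selection step
capped_funcs <- lmFuncs  # use rfFuncs here to match the question
capped_funcs$selectSize <- function(x, metric, maximize) {
  # x is the resampled performance table, one row per candidate size
  allowed <- x[x$Variables <= max_size, , drop = FALSE]
  pickSizeBest(allowed, metric = metric, maximize = maximize)
}

control <- rfeControl(functions = capped_funcs, method = "cv", number = 5)
results <- rfe(x, y, sizes = 1:12, rfeControl = control)

print(results$optsize)          # never larger than max_size
print(length(predictors(results)))
```

The full model is still fit and scored in the first iteration; it is just no longer eligible to be chosen as the final subset.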


 