
How to get attribute list from fitted model in Scikit-learn?

Is there any way to get the list of features (attributes) used by a fitted model in Scikit-learn (or the whole table of training data that was used)? I am using some preprocessing such as feature selection, and I would like to know which features were selected and which were removed. For example, I use a Random Forest classifier with Recursive Feature Elimination.

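A boolean mask of the selected features is stored in the 'support_' attribute of the fitted RFE object (True means the feature was kept). The 'ranking_' attribute gives the rank of each feature, with 1 assigned to the selected ones.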

See the doc here: http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFE.html#sklearn.feature_selection.RFE

Here is an example:

from sklearn.datasets import make_friedman1
from sklearn.feature_selection import RFE
from sklearn.svm import SVR

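# generate a synthetic regression dataset
X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)

estimator = SVR(kernel="linear")
# keep the 5 best features, removing one feature per iteration
selector = RFE(estimator, n_features_to_select=5, step=1)
X_new = selector.fit_transform(X, y)

print(selector.support_)  # boolean mask: True = feature kept
print(selector.ranking_)  # rank 1 = selected feature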

Will display:

[ True  True  True  True  True False False False False False]
[1 1 1 1 1 6 4 3 2 5]
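To get the actual list of selected and removed features, the boolean mask can be applied to an array of column names. This is a minimal sketch, assuming the names f0..f9 (hypothetical, since make_friedman1 returns an unlabeled array) and a scikit-learn version where n_features_to_select is passed by keyword:

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.feature_selection import RFE
from sklearn.svm import SVR

X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)

# Hypothetical column names; with a pandas DataFrame you would use df.columns.
feature_names = np.array(["f%d" % i for i in range(X.shape[1])])

selector = RFE(SVR(kernel="linear"), n_features_to_select=5, step=1).fit(X, y)

# support_ is a boolean mask aligned with the columns of X.
selected = feature_names[selector.support_]
removed = feature_names[~selector.support_]

print("selected:", list(selected))
print("removed:", list(removed))
```

If integer positions are more convenient than names, selector.get_support(indices=True) returns the indices of the kept columns.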

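Note that in older scikit-learn releases, using a random forest classifier as the RFE estimator raises this error, because RFE only looked for a 'coef_' attribute (recent releases also accept 'feature_importances_'):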

AttributeError: 'RandomForestClassifier' object has no attribute 'coef_'

I found a workaround in this thread: Recursive feature elimination on Random Forest using scikit-learn

You have to subclass RandomForestClassifier like this:

from sklearn.ensemble import RandomForestClassifier

class RandomForestClassifierWithCoef(RandomForestClassifier):
    def fit(self, *args, **kwargs):
        super(RandomForestClassifierWithCoef, self).fit(*args, **kwargs)
        # expose the importances under the name RFE expects
        self.coef_ = self.feature_importances_
        return self
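On a recent scikit-learn release the subclass may not be needed, since RFE can read 'feature_importances_' directly. Here is a minimal sketch under that assumption; the synthetic dataset from make_classification and the parameter values are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic data standing in for a real training table (an assumption).
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=4, random_state=0)

forest = RandomForestClassifier(n_estimators=50, random_state=0)

# RFE refits the forest at each step and drops the least important feature.
selector = RFE(forest, n_features_to_select=4, step=1).fit(X, y)

kept = [i for i, keep in enumerate(selector.support_) if keep]
dropped = [i for i, keep in enumerate(selector.support_) if not keep]

print("kept column indices:", kept)
print("dropped column indices:", dropped)
```

If this raises the AttributeError shown above, the installed version predates that support and the subclass workaround applies.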

Hope it helps :)

