
How to remove all features with near-zero importance using feature_importances_ in a Decision Tree?

I'm trying to build a new dataset for analysis, and to do that I need to remove all features with near-zero importance from the original dataset.

My dataset shape is (61176, 13047) after preprocessing.

I computed the feature importances for all features as follows:

from sklearn.tree import DecisionTreeClassifier

clf_features = DecisionTreeClassifier(min_samples_split=2, class_weight='balanced')
clf_features.fit(x_trn_tfidf, y_train)

This gives me the feature importances for all features as a NumPy array.
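For reference, the importances come from the fitted classifier's feature_importances_ attribute, which is a NumPy array with one score per feature (length 13047 here); a minimal sketch:

importances = clf_features.feature_importances_   # NumPy array of shape (13047,)
print((importances >= 0.001).sum())               # how many features clear a 0.001 cutoff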

Now I need to drop every feature whose importance is near zero (e.g. less than 0.001) and build a new dataset from the remaining features.

Can someone suggest how to do this?

Try this:

x_trn_tfidf[:, clf_features.feature_importances_ >= 0.001]

Note: this keeps only the features (columns) whose importance is greater than or equal to 0.001.
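For a self-contained illustration, here is a sketch of the full workflow on synthetic data (the sparse matrix X and labels y below are stand-ins for x_trn_tfidf and y_train, which are not shown in the question); boolean column masking works the same way on a scipy CSR matrix:

import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-ins for x_trn_tfidf / y_train (not part of the original question)
rng = np.random.RandomState(0)
X = sparse_random(500, 200, density=0.05, format="csr", random_state=rng)
y = rng.randint(0, 2, size=500)

# Fit the tree and read the importances (one value per column)
clf = DecisionTreeClassifier(min_samples_split=2, class_weight="balanced", random_state=0)
clf.fit(X, y)
importances = clf.feature_importances_

# Keep only the columns whose importance clears the threshold
mask = importances >= 0.001
X_reduced = X[:, mask]

print(X.shape, "->", X_reduced.shape)  # (500, 200) -> (500, k) where k = mask.sum()

Equivalently, sklearn.feature_selection.SelectFromModel(clf_features, threshold=0.001, prefit=True) wraps the same selection in a reusable transformer you can apply to the training and test sets consistently.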
