I'm trying to build a new dataset for analysis by removing all the features with near-zero importance from the original dataset.
My dataset has shape (61176, 13047) after preprocessing.
I computed the importance of every feature as follows:
from sklearn.tree import DecisionTreeClassifier
clf_features = DecisionTreeClassifier(min_samples_split=2, class_weight='balanced')
clf_features.fit(x_trn_tfidf, y_train)
This gives me the importance of every feature as a NumPy array.
Now I need to remove all the features with negligible importance (for example, values less than 0.001) and create a new dataset from the remaining ones.
Can someone suggest how to do this?
Try this:
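x_trn_tfidf[:, clf_features.feature_importances_ >= 0.001]
Note: This keeps only the columns (features) whose importance is greater than or equal to 0.001; everything below that threshold is dropped. The attribute is spelled feature_importances_ (plural, with a trailing underscore).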
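For anyone who wants to try this end to end, here is a minimal sketch of the same idea. It assumes a small random matrix as a stand-in for the TF-IDF data (the names X, y, mask, X_reduced and the toy data are made up for illustration, not taken from the question). It also shows scikit-learn's SelectFromModel with prefit=True, which should give the same columns as the boolean mask and works with sparse input:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.feature_selection import SelectFromModel
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the sparse TF-IDF matrix.
rng = np.random.default_rng(0)
X_dense = rng.random((200, 20))
y = (X_dense[:, 0] + X_dense[:, 1] > 1.0).astype(int)
X = csr_matrix(X_dense)

# Same classifier settings as in the question (random_state added for repeatability).
clf = DecisionTreeClassifier(min_samples_split=2, class_weight="balanced", random_state=0)
clf.fit(X, y)

threshold = 0.001

# Option 1: boolean mask over the columns, as in the answer above.
mask = clf.feature_importances_ >= threshold
X_reduced = X[:, mask]

# Option 2: let scikit-learn apply the threshold to the already fitted tree.
selector = SelectFromModel(clf, threshold=threshold, prefit=True)
X_selected = selector.transform(X)

print(X.shape, "->", X_reduced.shape)
```

If you later need to apply the same selection to a test set, reuse the same mask (or the same selector) rather than refitting, so the train and test columns stay aligned.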