
Python classification: define feature importance in advance

I am wondering if it is possible to define feature importances/weights in Python classification methods, for example:

model = tree.DecisionTreeClassifier(feature_weight = ...) 

I've seen that RandomForest has an attribute feature_importances_, which shows the importance of the features computed from the analysis. But is it possible to define the feature importances for the analysis in advance?
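
For reference, this is roughly what I mean by the random forest attribute (a minimal sketch, assuming a scikit-learn RandomForestClassifier and a small synthetic dataset):

```python
# Minimal sketch: reading feature_importances_ from a fitted RandomForestClassifier.
# The dataset and model settings here are placeholders, not from a real project.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# One importance value per feature, computed by the library after training --
# this is what I would like to be able to specify *before* training instead.
print(clf.feature_importances_)
```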

Thank you very much for your help in advance!

Feature importance in random forest classifiers is determined by forest-specific methods computed after training: the classic approach permutes a feature's values (or flips the binary tests that use it) and measures the additional classification error, while scikit-learn's feature_importances_ reports the mean decrease in impurity across the trees.
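
To illustrate that these importances are outputs of a trained model, here is a minimal sketch (assuming scikit-learn >= 0.22 and a synthetic dataset) comparing the impurity-based importances with permutation importances computed on held-out data:

```python
# Sketch: impurity-based vs. permutation feature importance. Both are derived
# from a trained model, not supplied to it. Data and parameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Mean decrease in impurity, accumulated while the trees are grown.
print("impurity-based:", forest.feature_importances_)

# Permutation importance: shuffle one feature at a time on the test set and
# measure how much the accuracy drops.
result = permutation_importance(forest, X_test, y_test, n_repeats=10, random_state=0)
print("permutation:   ", result.importances_mean)
```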

Feature importance is thus a property of the trained model's predictive behaviour, not something you set during the training phase. If you want your model to favour some features over others, you will have to find a trick, and that trick depends on the model.

Regarding sklearn's DecisionTreeClassifier, such a trick does not appear to be trivial. You could customize your class weights, if you know that some classes are more easily predicted by the features you want to favour; but this seems pretty dirty.
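
If you do go down that road, class_weight is the hook scikit-learn exposes; a minimal sketch (the weights below are arbitrary placeholders):

```python
# Sketch: biasing the tree toward correctly classifying class 1 via class_weight.
# This shifts the split criterion toward that class; it does not weight a
# specific feature directly, which is why it feels like a dirty workaround.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)

clf = DecisionTreeClassifier(class_weight={0: 1.0, 1: 3.0}, random_state=0)
clf.fit(X, y)
print(clf.feature_importances_)  # importances are still an output of training
```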

In other types of models, such as kernel-based ones, you can do this more easily, by setting hyperparameters or scalings that relate directly to individual features.
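
For example, with an RBF-kernel SVM you can approximate per-feature weighting simply by rescaling the columns before training, since a feature multiplied by a small factor contributes less to the kernel distance. A sketch with made-up weights:

```python
# Sketch: up- or down-weighting features for a kernel model by rescaling the
# input columns. The weight vector below is an arbitrary illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=4, random_state=0)

# Larger weight -> larger contribution to the RBF distance -> more influence.
feature_weights = np.array([1.0, 1.0, 0.2, 2.0])
X_weighted = X * feature_weights

clf = SVC(kernel="rbf", gamma="scale").fit(X_weighted, y)
print(clf.score(X_weighted, y))
```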

If you are trying to limit overfitting, I would also simply suggest that you remove the features you know to be less important.
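
In scikit-learn that can be as simple as slicing out the columns you want to keep before fitting (the indices below are placeholders):

```python
# Sketch: dropping features you already know to be unimportant before training.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=6, random_state=0)

keep = [0, 2, 3]            # hypothetical indices of the features worth keeping
X_reduced = X[:, keep]

clf = DecisionTreeClassifier(random_state=0).fit(X_reduced, y)
print(clf.score(X_reduced, y))
```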
