
How is feature importance calculated for GradientBoostingClassifier

I'm using scikit-learn's gradient-boosted trees classifier, GradientBoostingClassifier. It exposes feature importance scores via its feature_importances_ attribute. How are those feature importances calculated?

I'd like to know what algorithm scikit-learn uses so I can interpret those numbers. The algorithm isn't described in the documentation.

This is documented elsewhere in the scikit-learn documentation. In particular, here is how it works:

For each tree, the importance of a feature F is calculated as the fraction of samples that traverse a node splitting on feature F. Those per-tree scores are then averaged across all trees in the ensemble.

It is not documented exactly how scikit-learn estimates the fraction of samples that will traverse a node that splits on feature F.
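The averaging step can be sketched directly against a fitted model. This is an illustrative sketch on hypothetical toy data, not scikit-learn's own implementation: internally, scikit-learn averages the *unnormalized* impurity-based importances of the individual trees before normalizing, so the manually averaged result below is close to, but not guaranteed identical to, feature_importances_.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical toy data just for illustration
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X, y)

# estimators_ is a 2-D array of shape (n_estimators, K); each entry is a
# DecisionTreeRegressor with its own per-tree feature_importances_.
per_tree = np.array([tree.feature_importances_
                     for stage in clf.estimators_
                     for tree in stage])

# Average across all trees, then renormalize so the scores sum to 1
manual = per_tree.mean(axis=0)
manual /= manual.sum()

print(manual)
print(clf.feature_importances_)
```

Comparing the two printed arrays shows how closely the simple "average over trees" picture tracks the attribute the ensemble reports.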

The interpretation: scores lie in the range [0, 1], and a higher score means the feature is more important. Concretely, feature_importances_ is an array of shape (n_features,) whose values are non-negative and sum to 1.0.
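Those properties are easy to verify, and the most common use of the scores is simply ranking features. A minimal check, again on hypothetical toy data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=300, n_features=6, n_informative=2,
                           random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

imp = clf.feature_importances_
assert imp.shape == (6,)                         # one score per feature
assert np.all(imp >= 0) and np.isclose(imp.sum(), 1.0)

# Rank feature indices from most to least important
ranking = np.argsort(imp)[::-1]
print(ranking)
```

Because the scores are relative fractions of a fixed total, they are useful for comparing features within one model, not for comparing absolute importance across different models.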
