
Why do we select entropy gain as the splitting criterion in decision tree learning instead of the decrease in error rate?

I've been following the ML course by Tom Mitchell, and in decision tree (DT) learning, entropy gain is chosen as the ruling criterion for selecting a feature/parameter x_i as the child of another feature during top-down growth of the DT.

Our goal in selecting a DT is always to avoid overfitting by minimizing the error rate; so why don't we use the error rate as the ruling criterion for feature/parameter selection during top-down growth of the tree?

Feature vector for the input data: X = <x_1, x_2, ..., x_n>
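To make the question concrete, here is a small sketch (not from the course) that computes both the information gain and the reduction in error rate for one candidate split on some feature x_i; the labels and the split below are made up purely for illustration:

import numpy as np

def entropy(y):
    # Shannon entropy (in bits) of a label array.
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def misclassification_error(y):
    # Error rate if we predict the majority class at this node.
    _, counts = np.unique(y, return_counts=True)
    return 1.0 - counts.max() / counts.sum()

def gain(y, y_left, y_right, impurity):
    # Reduction in impurity from splitting y into y_left / y_right.
    n, nl, nr = len(y), len(y_left), len(y_right)
    return impurity(y) - (nl / n) * impurity(y_left) - (nr / n) * impurity(y_right)

# Hypothetical parent node and one candidate split on some feature x_i.
y       = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_left  = np.array([1, 1, 1, 0])   # samples with x_i below the threshold
y_right = np.array([1, 0, 0, 0])   # samples with x_i above the threshold

print("information gain    :", gain(y, y_left, y_right, entropy))
print("error-rate reduction:", gain(y, y_left, y_right, misclassification_error))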

You can't use the error rate because you don't know what it will be in the end. That is, imagine the tree eventually ends up with depth 10, and you are at level 2 of the tree, deciding which feature and which threshold to choose. At this stage you can't know what the error rate will be at level 10, so your criterion should be based only on the current level. That said, you don't have to use information gain. There are other criteria as well. For example, one such criterion is Gini impurity, and this is the default criterion used in scikit-learn's DecisionTreeClassifier.
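As a rough sketch of the scikit-learn usage mentioned above (the iris dataset and the random_state value are just illustrative choices, not part of the original answer), the criterion is selected via the criterion parameter:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Default criterion is Gini impurity; information gain is available via criterion="entropy".
gini_tree    = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
entropy_tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

print(gini_tree.score(X, y), entropy_tree.score(X, y))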
