
Why do we select Entropy Gain as the criterion in Decision Tree learning instead of decrease in error rate?

I've been following the ML course by Tom Mitchell, and in Decision Tree (DT) learning, entropy gain is chosen as the ruling criterion for selecting a feature/parameter x_i as the child of another feature during the top-down growth of the DT.

Our goal in selecting a DT is always to avoid overfitting by minimizing the error rate; so why don't we use the error rate as the ruling criterion for feature/parameter selection in the top-down growth of the tree?

Feature vector for input data: X = < x_1, x_2, ..., x_n >

You can't use the error rate because you don't know what it will be in the end. That is, imagine that the tree eventually has depth 10 and you are at level 2 of the tree, deciding which feature and which threshold to choose. At this stage you can't know what the error rate will be at level 10, so your criterion should be based only on the current level. That being said, you don't have to use information gain; there are other criteria as well. For example, one such criterion is Gini impurity, which is the default criterion used in scikit-learn's DecisionTreeClassifier.
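To make the candidate criteria concrete, here is a minimal sketch (my own illustration, not from the original answer) of the three common node-impurity measures a decision tree can score a split with: entropy, Gini impurity, and the misclassification error the question asks about. The example labels and split layout are invented for demonstration.

```python
# A minimal sketch (illustrative, not from the original answer):
# three node-impurity measures for scoring a binary split.
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum_k p_k * log2(p_k) -- basis of information gain."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """G(S) = 1 - sum_k p_k^2 -- scikit-learn's default criterion."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def misclassification_error(labels):
    """E(S) = 1 - max_k p_k -- the 'error rate' criterion from the question."""
    n = len(labels)
    return 1.0 - max(Counter(labels).values()) / n

def split_score(left, right, impurity):
    """Weighted impurity of a binary split; lower is better."""
    n = len(left) + len(right)
    return (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)

if __name__ == "__main__":
    # Two candidate splits of the same 8 labels (hypothetical data).
    # Split A produces a pure left child; Split B keeps both children mixed.
    a_left, a_right = [0, 0], [0, 0, 1, 1, 1, 1]
    b_left, b_right = [0, 1, 1, 1], [0, 0, 0, 1]
    for name, imp in [("entropy", entropy), ("gini", gini),
                      ("error", misclassification_error)]:
        print(name,
              round(split_score(a_left, a_right, imp), 3),
              round(split_score(b_left, b_right, imp), 3))
```

On this toy data, entropy and Gini both score split A lower (better) than split B, while the misclassification error happens to score them identically, which hints at why the smoother criteria are usually preferred for greedy, level-by-level growth. In scikit-learn you can switch between the first two with DecisionTreeClassifier(criterion="gini") or criterion="entropy".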

