
How do I get all Gini indices in my decision tree?

I have built a decision tree with scikit-learn, viz. sklearn.tree.DecisionTreeClassifier().fit(x, y).

How do I get the Gini indices for all candidate nodes at each step? graphviz only gives me the Gini index of the node with the lowest Gini index, i.e. the node used for the split.

For example, the image below (from graphviz) tells me the Gini score of the right node under Pclass_lowVMid, which is 0.408, but not the Gini index of Pclass_lower or Sex_male at that step. I only know that the Gini indices of Pclass_lower and Sex_male must be greater than (0.408*0.7 + 0), but that's it.

[Image: decision tree]

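Gini index of the pclass node = Gini index of the left node * (number of samples in the left node / (number of samples in the left node + number of samples in the right node)) + Gini index of the right node * (number of samples in the right node / (number of samples in the left node + number of samples in the right node)). So here it will be:

Gini index of pclass = 0 + 0.408 * (7/10) = 0.2856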
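The same weighted sum can be applied by hand to any candidate feature, which is one way to get the scores graphviz does not show. Below is a minimal sketch in plain Python. The helper names gini and weighted_split_gini, the columns feature_a and feature_b, and the toy labels are all made up for illustration (they are not the Titanic data from the question), and as far as I know scikit-learn itself keeps only the chosen split, not the scores of the rejected candidates.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_split_gini(feature, labels, threshold=0.5):
    """Sample-weighted Gini of the two children of the split feature <= threshold."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Hypothetical toy data: 10 samples, two binary candidate features.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
candidates = {
    "feature_a": [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],  # 3 pure samples left, 7 mixed right
    "feature_b": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],  # 5 mixed samples on each side
}

# Score every candidate; the tree would pick the one with the lowest value.
scores = {name: weighted_split_gini(col, labels) for name, col in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: weighted Gini = {score:.4f}")
print("lowest:", best)
```

For feature_a the right child holds 7 samples with a 5/2 class mix, so its impurity is about 0.408 and the weighted score is 0.7 times that, the same arithmetic as in the line above.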

Using export_graphviz shows the impurity for all nodes, at least in version 0.20.1.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from graphviz import Source

# Load the iris data set
data = load_iris()
X, y = data.data, data.target

# Fit a shallow tree so the rendered graph stays small
clf = DecisionTreeClassifier(max_depth=2, random_state=42)
clf.fit(X, y)

# Export the tree to DOT and render it to dt.png; each node box includes its gini value
graph = Source(export_graphviz(clf, out_file=None, feature_names=data.feature_names))
graph.format = 'png'
graph.render('dt', view=True)

[Image: rendered decision tree]

The impurity values for all nodes are also accessible in the impurity attribute of the fitted tree (clf.tree_).

clf.tree_.impurity
array([0.66666667, 0.        , 0.5       , 0.16803841, 0.04253308])
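To relate these per-node values to the weighted figure used in the other answer, the tree arrays can be walked node by node. This is a sketch that assumes the same iris model as above; split_scores is a name introduced here for illustration. It relies on the tree_ attributes children_left, children_right, weighted_n_node_samples, feature and threshold, with -1 in children_left marking a leaf.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(data.data, data.target)
tree = clf.tree_

split_scores = {}  # internal node id -> sample-weighted impurity of its two children
n = tree.weighted_n_node_samples
for node in range(tree.node_count):
    left, right = tree.children_left[node], tree.children_right[node]
    if left == -1:  # -1 marks a leaf, which has no split to score
        continue
    weighted = (n[left] * tree.impurity[left] + n[right] * tree.impurity[right]) / n[node]
    split_scores[node] = float(weighted)
    name = data.feature_names[tree.feature[node]]
    print(f"node {node}: {name} <= {tree.threshold[node]:.2f}, "
          f"own gini = {tree.impurity[node]:.4f}, weighted child gini = {weighted:.4f}")
```

This only covers the splits the tree actually made; scores for the features it rejected have to be computed separately from the training data.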


