
Is there any way to manually modify the thresholds set in the decision tree learnt from a given dataset?

I was trying to create a decision tree model using scikit-learn's tree module. Once I had generated the model, I visualized the tree and the criteria on which the decisions were based. However, I would like to modify the thresholds in some of the criteria manually to see how the output changes. Is there any way to do that? Or is there a library that converts the decision tree into a set of if-else statements once it has learned the thresholds from the dataset, and vice versa?

I know that the thresholds chosen by the module are based on some impurity metrics like Gini-impurity, information gain, etc. However, I still would like to experiment with those threshold values.

Thanks!

Yes, you can easily do this.

A sklearn decision tree exposes its underlying tree through the tree_ attribute. Among other things, tree_ has an attribute threshold, which is a numpy array containing the threshold values of all nodes. You can modify this array in place, thereby changing the thresholds.

For example:

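from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
dt = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(dt.tree_.threshold)     # All the thresholds; size equals dt.tree_.node_count
dt.tree_.threshold[3] = 10.0  # Manually modify a threshold (only matters if node 3 is a split node, not a leaf)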
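
Not every entry of that array is a real split: leaf nodes carry a placeholder threshold of -2, and changing it has no effect. The following is a minimal sketch for listing the nodes whose thresholds are worth editing. The random_state=0 argument and the variable names are my own choices, not part of the original example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
# random_state is fixed only so repeated runs build the same tree (an assumption of this sketch)
dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

tree = dt.tree_
# A node is a leaf when it has no children; scikit-learn marks that with -1
is_leaf = tree.children_left == -1

# Internal (split) nodes are the only ones whose threshold is actually used
internal_nodes = [i for i in range(tree.node_count) if not is_leaf[i]]
for i in internal_nodes:
    print(f"node {i}: feature {tree.feature[i]} <= {tree.threshold[i]:.4f}")
```

Picking an index from internal_nodes, rather than a fixed number, avoids accidentally editing a leaf.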

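To verify, compare the accuracy on a separate test set before and after this modification. Assuming you modified a non-leaf node, you should notice a change, most likely for the worse. Leaf nodes store a placeholder threshold of -2 that is never used in a comparison, so editing one has no effect on predictions.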
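
A sketch of that check, under a few assumptions of mine: a 70/30 train_test_split, random_state=0, and an intentionally extreme new threshold for the root (node 0) so that every sample is forced down the left branch. The new value is arbitrary and chosen only to make the effect easy to observe.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

root_feature = int(dt.tree_.feature[0])   # feature tested at the root
right_child = int(dt.tree_.children_right[0])

# Accuracy and routing before the edit
acc_before = dt.score(X_test, y_test)
went_right_before = int(dt.decision_path(X_test)[:, right_child].sum())

# Arbitrary new threshold, larger than every value of that feature,
# so the test "x[root_feature] <= threshold" is true for all samples
new_threshold = X[:, root_feature].max() + 1.0
dt.tree_.threshold[0] = new_threshold

# Accuracy and routing after the edit
acc_after = dt.score(X_test, y_test)
went_right_after = int(dt.decision_path(X_test)[:, right_child].sum())

print("accuracy before:", acc_before, "after:", acc_after)
print("samples sent right at root before:", went_right_before,
      "after:", went_right_after)
```

decision_path is used here because accuracy alone can stay unchanged by coincidence, whereas the routing count shows directly that the edited threshold is the one being applied.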
