How to make sklearn.ensemble.RandomForestRegressor ignore the impurity decrease heuristic

I am using sklearn's RandomForestRegressor to implement random forest imputation. Sklearn lets us set the parameter min_impurity_decrease to control the split-stopping heuristic. For example, if min_impurity_decrease = 0.0 and splitting a node would result in a worse impurity, then the node is made a leaf node.

The problem is that I would prefer the random forest to be fully grown, without early stopping or pruning. But min_impurity_decrease has to be set to a non-negative float. Is there any way around this?

Naively, I tried setting min_impurity_decrease = float("-inf"), which results in an error message.
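The error described above can be reproduced with a minimal sketch like the one below. The synthetic data, the variable names, and the choice of n_estimators=5 are assumptions made for illustration. The sketch also assumes that the parameter is validated when fit is called rather than in the constructor, and that the exception is a ValueError or a subclass of it; the exact message depends on the installed sklearn version.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Small synthetic regression problem (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)

# Constructing the estimator is not expected to fail; parameters are checked in fit().
model = RandomForestRegressor(
    n_estimators=5,
    min_impurity_decrease=float("-inf"),
    random_state=0,
)

error_message = None
try:
    model.fit(X, y)
except ValueError as exc:
    # The negative value is rejected here; the wording varies between versions.
    error_message = str(exc)

print(error_message)
```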

You apparently have to modify the sklearn code. Take a look at this answer on how to install sklearn in editable mode. Be sure to create a new virtual environment so that you do not mess up the original sklearn files.

The good news is that you don't have to change any Cython code. Go to the file sklearn/tree/tree.py. A check on the value of min_impurity_decrease only seems to be present in the BaseDecisionTree class. According to GitHub, there is a code snippet at line 306:

if self.min_impurity_decrease < 0.:
    raise ValueError("min_impurity_decrease must be greater than "
                     "or equal to 0")

Simply delete this check and reload the library. I couldn't test this solution, so let me know if you run into any problems.
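Before patching the library, it may be worth checking whether the unmodified settings already stop early on your data. The sketch below is only an empirical check on synthetic data, not a statement about sklearn internals: it fits a forest with min_impurity_decrease=0.0 and no depth limit, then reports the largest impurity found in any leaf. The helper max_leaf_impurity is a hypothetical name introduced here; it relies on the fitted tree_ attribute's children_left and impurity arrays, where a value of -1 in children_left marks a leaf.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with continuous features and targets (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = rng.normal(size=200)

model = RandomForestRegressor(
    n_estimators=10,
    max_depth=None,             # no depth limit
    min_samples_split=2,
    min_samples_leaf=1,
    min_impurity_decrease=0.0,  # smallest value the check accepts
    random_state=0,
)
model.fit(X, y)


def max_leaf_impurity(forest):
    """Return the largest leaf impurity across all trees (hypothetical helper)."""
    worst = 0.0
    for est in forest.estimators_:
        tree = est.tree_
        is_leaf = tree.children_left == -1  # -1 marks a leaf node
        worst = max(worst, float(tree.impurity[is_leaf].max()))
    return worst


worst_leaf_impurity = max_leaf_impurity(model)
print(worst_leaf_impurity)
```

If the printed value is effectively zero, every leaf is pure with respect to the sample its tree was trained on, which would suggest that no split was stopped early on this data.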
