
Create my own criteria function for decision tree

I'm using the sklearn DecisionTreeClassifier and I would like to create my own criterion function (by default you can use gini or entropy, but that's not what I'm looking for). Something like this:

clf = DecisionTreeClassifier(criterion='my_function')

Is it possible to do that?

Is there a similar algorithm that allows this (in Python or R)?

Thanks.

For R, you can use the rpart package. In particular, see the User Written Split Functions vignette. Despite having limited decision tree experience, I was able to follow the examples to handle a multivariate output using a custom algorithm.

However, note that the built-in classifiers use a fast external library, so if you write your algorithm in plain R, the processing may be considerably slower. As the vignette notes, cross-validation is disabled by default due to this expected slowdown. If your data is "small" enough (or you can wait a bit longer), this may not be an issue in your case.
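To give a feel for what a user-written split function has to do, here is a minimal Python sketch of the core step: scoring candidate thresholds on one feature with a pluggable impurity function. This is not the rpart interface and not an sklearn API; the names `misclassification` and `best_split` are hypothetical and chosen only for illustration, and the misclassification-error impurity is just one example of a custom criterion.

```python
import numpy as np

def misclassification(y):
    """Example custom impurity (an assumption): 1 - share of the majority class."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    return 1.0 - counts.max() / len(y)

def best_split(x, y, impurity=misclassification):
    """Find the best binary split on a single numeric feature.

    Candidate thresholds are midpoints between consecutive distinct values
    of x. Returns (threshold, weighted_impurity); threshold is None when no
    candidate strictly improves on the impurity of the unsplit node.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    values = np.unique(x)  # sorted distinct feature values
    best_threshold, best_score = None, impurity(y)
    for lo, hi in zip(values[:-1], values[1:]):
        threshold = (lo + hi) / 2.0
        left, right = y[x <= threshold], y[x > threshold]
        # Weight each child's impurity by its share of the samples.
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / len(y)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Toy data: the two classes are cleanly separated between 3 and 10.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [0, 0, 0, 1, 1, 1]
threshold, score = best_split(x, y)
print(threshold, score)
```

Any function that maps an array of labels to a number can be passed as `impurity`. A full tree would apply `best_split` to every feature at each node and recurse on the two children. As with the plain-R case above, a pure-Python loop like this will likely be much slower than a compiled implementation.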
