
Scikit-learn: using GridSearchCV on DecisionTreeClassifier

I tried to use GridSearchCV on DecisionTreeClassifier, but I get the following error: TypeError: unbound method get_params() must be called with DecisionTreeClassifier instance as first argument (got nothing instead)

Here's my code:

from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score

X, Y = createDataSet(filename)
tree_para = {'criterion':['gini','entropy'],'max_depth':[4,5,6,7,8,9,10,11,12,15,20,30,40,50,70,90,120,150]}
clf = GridSearchCV(DecisionTreeClassifier, tree_para, cv=5)
clf.fit(X, Y)

In your call to GridSearchCV, the first argument should be an instance of DecisionTreeClassifier rather than the class itself. It should be:

clf = GridSearchCV(DecisionTreeClassifier(), tree_para, cv=5)
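To see the corrected call run end to end, here is a minimal sketch. It assumes the iris dataset as a stand-in for the asker's createDataSet(filename), which is not shown, and it shortens the max_depth list to keep the run quick. The fixed random_state is only there to make the result repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris replaces the asker's own data loader.
X, Y = load_iris(return_X_y=True)

# A shortened version of the grid from the question.
tree_para = {'criterion': ['gini', 'entropy'], 'max_depth': [2, 3, 4, 5, 6]}

# Pass an instance (note the parentheses), not the class itself.
clf = GridSearchCV(DecisionTreeClassifier(random_state=0), tree_para, cv=5)
clf.fit(X, Y)

print(clf.best_params_)  # best combination found on this data
print(clf.best_score_)   # its mean cross-validated accuracy
```

After fitting, clf.best_estimator_ holds a tree refit on all of the data with the best parameters, so it can be used for prediction directly.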

Check out the example here for more details.

Hope that helps!

Another point about the parameters: grid search tries every combination of the values you supply. The grid below checks each value of criterion together with each value of max_depth:

tree_param = {'criterion':['gini','entropy'],'max_depth':[4,5,6,7,8,9,10,11,12,15,20,30,40,50,70,90,120,150]}

If needed, the grid search can be run over multiple sets of parameter candidates.

For example:

tree_param = [{'criterion': ['entropy', 'gini'], 'max_depth': max_depth_range},
              {'min_samples_leaf': min_samples_leaf_range}]

In this case, the grid search runs over two separate sets of parameters: first every combination of criterion and max_depth, and second every provided value of min_samples_leaf on its own.
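The expansion described above can be sketched with the standard library alone. The two ranges are hypothetical, since the answer does not say what max_depth_range and min_samples_leaf_range contain, and expand is an illustrative helper, not scikit-learn code (scikit-learn's own ParameterGrid class in sklearn.model_selection performs this kind of expansion).

```python
from itertools import product

# Hypothetical ranges; the answer leaves these undefined.
max_depth_range = [4, 6, 8]
min_samples_leaf_range = [1, 5, 10, 20]

tree_param = [{'criterion': ['entropy', 'gini'], 'max_depth': max_depth_range},
              {'min_samples_leaf': min_samples_leaf_range}]

def expand(grids):
    """Yield one settings dict per candidate; each sub-grid is expanded on its own."""
    for grid in grids:
        names = sorted(grid)
        for values in product(*(grid[name] for name in names)):
            yield dict(zip(names, values))

candidates = list(expand(tree_param))
for params in candidates:
    print(params)
print(len(candidates))  # 2 * 3 + 4 = 10
```

Note that no candidate mixes criterion with min_samples_leaf. Parameters that a sub-grid does not name stay at the estimator's defaults for those candidates.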

You need to add () after the classifier name:

clf = GridSearchCV(DecisionTreeClassifier(), tree_para, cv=5)
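Why the parentheses matter can be shown without scikit-learn at all. The sketch below uses a hypothetical ToyEstimator class, not a real estimator: an instance method such as get_params() needs an object to read from, so calling it on the bare class raises a TypeError. The "unbound method" message in the question is the Python 2 wording of the same failure. The exact text depends on the Python and scikit-learn versions in use.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator; not scikit-learn code."""

    def __init__(self, max_depth=None):
        self.max_depth = max_depth

    def get_params(self):
        # Needs an instance, because it reads attributes from self.
        return {'max_depth': self.max_depth}

# Works: called on an instance.
params = ToyEstimator(max_depth=3).get_params()
print(params)

# Fails: called on the class, so there is no self to read from.
try:
    ToyEstimator.get_params()
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```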

Here is the code for a decision tree grid search:

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV

def dtree_grid_search(X, y, nfolds):
    # dictionary of all the values we want to test
    param_grid = {'criterion': ['gini', 'entropy'], 'max_depth': np.arange(3, 15)}
    # decision tree model (an instance, not the class)
    dtree_model = DecisionTreeClassifier()
    # use grid search to test every combination with nfolds-fold cross-validation
    dtree_gscv = GridSearchCV(dtree_model, param_grid, cv=nfolds)
    # fit the model to the data
    dtree_gscv.fit(X, y)
    # return the best parameter combination found
    return dtree_gscv.best_params_

If the problem is still there, try replacing:

from sklearn.grid_search import GridSearchCV

with

from sklearn.model_selection import GridSearchCV

It sounds stupid, but I had similar problems and managed to solve them with this tip.
