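I am trying to see which parameters are currently being used inside a custom scoring function for GridSearchCV while the grid search is running.
Edit: to clarify, I want to use the parameters from the grid search, so I need to be able to access them inside the function.
Ideally, it would look like this: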
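from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV

def fit(X, y):
    grid = {'max_features': [0.8, 'sqrt'],
            'subsample': [1, 0.7],
            'min_samples_split': [2, 3],
            'min_samples_leaf': [1, 3],
            'learning_rate': [0.01, 0.1],
            'max_depth': [3, 8, 15],
            'n_estimators': [10, 20, 50]}
    clf = GradientBoostingClassifier()
    # Newer scikit-learn releases use response_method='predict_proba' instead of needs_proba
    score_func = make_scorer(make_custom_score, needs_proba=True)
    model = GridSearchCV(estimator=clf,
                         param_grid=grid,
                         scoring=score_func,
                         cv=5)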
def make_custom_score(y_true, y_score):
    '''
    y_true: array-like, shape = [n_samples] Ground truth (true relevance labels).
    y_score : array-like, shape = [n_samples] Predicted scores
    '''
    print(parameters_used_in_current_gridsearch)
    …
    return score
I know the parameters are available once the search has finished, but I am trying to get them while the code is executing.
I'm not sure whether this fits your use case, but there is a verbose parameter for this kind of thing:
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import SGDRegressor

estimator = SGDRegressor()
gscv = GridSearchCV(estimator, {
    'alpha': [0.001, 0.0001], 'average': [True, False],
    'shuffle': [True, False], 'max_iter': [5], 'tol': [None]
}, cv=3, verbose=2)
gscv.fit([[1, 1, 1], [2, 2, 2], [3, 3, 3]], [1, 2, 3])
This prints the following to stdout:
Fitting 3 folds for each of 8 candidates, totalling 24 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[CV] alpha=0.001, average=True, max_iter=5, shuffle=True, tol=None ...
[CV] alpha=0.001, average=True, max_iter=5, shuffle=True, tol=None, total= 0.0s
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[CV] alpha=0.001, average=True, max_iter=5, shuffle=True, tol=None ...
[CV] alpha=0.001, average=True, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.001, average=True, max_iter=5, shuffle=True, tol=None ...
[CV] alpha=0.001, average=True, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.001, average=True, max_iter=5, shuffle=False, tol=None ..
[CV] alpha=0.001, average=True, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.001, average=True, max_iter=5, shuffle=False, tol=None ..
[CV] alpha=0.001, average=True, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.001, average=True, max_iter=5, shuffle=False, tol=None ..
[CV] alpha=0.001, average=True, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.001, average=False, max_iter=5, shuffle=True, tol=None ..
[CV] alpha=0.001, average=False, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.001, average=False, max_iter=5, shuffle=True, tol=None ..
[CV] alpha=0.001, average=False, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.001, average=False, max_iter=5, shuffle=True, tol=None ..
[CV] alpha=0.001, average=False, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.001, average=False, max_iter=5, shuffle=False, tol=None .
[CV] alpha=0.001, average=False, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.001, average=False, max_iter=5, shuffle=False, tol=None .
[CV] alpha=0.001, average=False, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.001, average=False, max_iter=5, shuffle=False, tol=None .
[CV] alpha=0.001, average=False, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=True, tol=None ..
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=True, tol=None ..
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=True, tol=None ..
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=False, tol=None .
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=False, tol=None .
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=False, tol=None .
[CV] alpha=0.0001, average=True, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=True, tol=None .
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=True, tol=None .
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=True, tol=None .
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=True, tol=None, total= 0.0s
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=False, tol=None
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=False, tol=None
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=False, tol=None, total= 0.0s
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=False, tol=None
[CV] alpha=0.0001, average=False, max_iter=5, shuffle=False, tol=None, total= 0.0s
[Parallel(n_jobs=1)]: Done 24 out of 24 | elapsed: 0.0s finished
See the documentation for details; you can also pass a higher value for more detailed output.
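As a rough illustration of what a higher verbosity level adds, the sketch below runs a tiny grid search with verbose=3 and captures what is printed. The synthetic data, the single-parameter grid, and the helper name run_search are assumptions made for this example, not part of the answer above. Capturing stdout this way only works when the search runs in the current process (the default n_jobs).

```python
import contextlib
import io

from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data, assumed here only so the example is self-contained
X, y = make_regression(n_samples=60, n_features=3, noise=0.1, random_state=0)


def run_search(verbose):
    """Run a tiny grid search and return whatever it printed to stdout."""
    buffer = io.StringIO()
    search = GridSearchCV(
        SGDRegressor(random_state=0),
        {"alpha": [0.001, 0.0001]},
        cv=3,
        verbose=verbose,
    )
    with contextlib.redirect_stdout(buffer):
        search.fit(X, y)
    return buffer.getvalue()


# With verbose=3 each [CV] line should also carry the fold score
log_with_scores = run_search(verbose=3)
print(log_with_scores)
```

The exact layout of the [CV] lines differs between scikit-learn versions, so it is safer to read them by eye than to parse them.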
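If you actually need to do something between the grid search steps, you will have to write your own routine using some lower-level scikit-learn functionality.
GridSearchCV internally uses the ParameterGrid class, which you can iterate over to get the combinations of parameter values.
The basic loop looks like this: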
from sklearn.base import clone
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import make_scorer
from sklearn.model_selection import ParameterGrid, KFold

# X, y (NumPy arrays) and make_custom_score are assumed to be defined elsewhere
clf = GradientBoostingClassifier()
grid = {
    'max_features': [0.8, 'sqrt'],
    'subsample': [1, 0.7],
    'min_samples_split': [2, 3],
    'min_samples_leaf': [1, 3],
    'learning_rate': [0.01, 0.1],
    'max_depth': [3, 8, 15],
    'n_estimators': [10, 20, 50]
}
scorer = make_scorer(make_custom_score, needs_proba=True)
sampler = ParameterGrid(grid)
cv = KFold(5)
for params in sampler:
    for ix_train, ix_test in cv.split(X, y):
        # Apply the current parameter combination to a fresh copy before fitting
        clf_fitted = clone(clf).set_params(**params).fit(X[ix_train], y[ix_train])
        score = scorer(clf_fitted, X[ix_test], y[ix_test])
        # do something with the results
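For readers who want to run the loop end to end, here is a minimal self-contained sketch of the same idea. The synthetic dataset, the reduced grid, and the choice of roc_auc_score as the metric are assumptions for illustration; the original custom score function is not shown in the thread, so it is not reproduced here.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, ParameterGrid

# Synthetic binary classification data (assumed, for illustration only)
X, y = make_classification(n_samples=120, n_features=8, random_state=0)

# A deliberately small grid so the example runs quickly
grid = {"max_depth": [2, 3], "n_estimators": [10, 20]}
base = GradientBoostingClassifier(random_state=0)
cv = KFold(n_splits=3, shuffle=True, random_state=0)

results = []
for params in ParameterGrid(grid):
    fold_scores = []
    for ix_train, ix_test in cv.split(X, y):
        # Fresh copy of the estimator with the current parameter combination
        model = clone(base).set_params(**params)
        model.fit(X[ix_train], y[ix_train])
        proba = model.predict_proba(X[ix_test])[:, 1]
        fold_scores.append(roc_auc_score(y[ix_test], proba))
    # The parameters are available here, while the search is still running
    print(params, fold_scores)
    results.append((params, float(np.mean(fold_scores))))

best_params, best_score = max(results, key=lambda item: item[1])
print("best:", best_params, best_score)
```

Unlike GridSearchCV, this loop does not refit a final model on the full data; if one is needed, clone(base).set_params(**best_params).fit(X, y) would be one way to get it.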
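Instead of using make_scorer() on your "custom score", you can write your own scorer (note the difference between score and scorer!). It takes three arguments, with the signature (estimator, X_test, y_test), and is passed directly as the scoring argument.
See the documentation for more details.
Inside this function you have access to the estimator object that the grid search has already fitted on the training data, so you can easily read all of that estimator's parameters. Just make sure to return a float as the score.
Something like: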
def make_custom_scorer(estimator, X_test, y_test):
    '''
    estimator: scikit-learn estimator, fitted on train data
    X_test: array-like, shape = [n_samples, n_features] Data for prediction
    y_test: array-like, shape = [n_samples] Ground truth (true relevance labels).
    y_score : array-like, shape = [n_samples] Predicted scores
    '''
    # Here all_params is a dict of all the parameters in use
    all_params = estimator.get_params()
    # You need to do some filtering to get the parameters you want,
    # but that should be easy I guess (just specify the key you want)
    parameters_used_in_current_gridsearch = {k: v for k, v in all_params.items()
                                             if k in ['max_features', 'subsample', ..., 'n_estimators']}
    print(parameters_used_in_current_gridsearch)
    y_score = estimator.predict(X_test)
    # Use whichever metric you want here
    score = scoring_function(y_test, y_score)
    return score
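To show how such a scorer plugs into GridSearchCV, here is a small runnable sketch. The dataset, the reduced grid, the roc_auc_score metric, and the names auc_scorer_with_params and seen are assumptions for this example. The seen list is only filled when the search runs in the current process (the default n_jobs); with parallel workers the appends would happen in other processes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (assumed, for illustration only)
X, y = make_classification(n_samples=120, n_features=8, random_state=0)

grid = {"max_depth": [2, 3], "n_estimators": [10, 20]}
seen = []  # hypothetical log of the parameter combinations the scorer saw


def auc_scorer_with_params(estimator, X_test, y_test):
    """Scorer with the (estimator, X_test, y_test) signature."""
    all_params = estimator.get_params()
    # Keep only the parameters that are part of the grid
    current = {name: all_params[name] for name in grid}
    seen.append(current)
    print(current)
    proba = estimator.predict_proba(X_test)[:, 1]
    # A scorer must return a single number; higher means better
    return float(roc_auc_score(y_test, proba))


search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid=grid,
    scoring=auc_scorer_with_params,  # pass the callable itself, no make_scorer
    cv=3,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Because the scorer receives the fitted estimator, it can also branch on the parameters, for example to skip expensive diagnostics for some combinations.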