How does scikit-learn's GridSearchCV compute best_score_?

Question

I've been trying to figure out how the best_score_ attribute of GridSearchCV is calculated (in other words, what it means). The documentation says:

Score of best_estimator on the left out data.

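So I tried to translate that into something I understand: I computed the r2_score between the actual "y" values and the predicted y values from each k-fold split, and got a different result (using this piece of code):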

test_pred = np.zeros(y.shape) * np.nan 
for train_ind, test_ind in kfold:
    clf.best_estimator_.fit(X[train_ind, :], y[train_ind])
    test_pred[test_ind] = clf.best_estimator_.predict(X[test_ind])
r2_test = r2_score(y, test_pred)

I've looked everywhere for a more meaningful explanation of best_score_ but couldn't find anything. Would anyone care to explain?

Thanks

Answer 1

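It's the mean cross-validation score of the best estimator. Let's make some data and fix the cross-validation's division of the data into folds.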

>>> y = np.linspace(-5, 5, 200)
>>> X = (y + np.random.randn(200)).reshape(-1, 1)
>>> threefold = list(KFold(len(y)))

Now run cross_val_score and GridSearchCV, both with these fixed folds.

>>> cross_val_score(LinearRegression(), X, y, cv=threefold)
array([-0.86060164,  0.2035956 , -0.81309259])
>>> gs = GridSearchCV(LinearRegression(), {}, cv=threefold, verbose=3).fit(X, y) 
Fitting 3 folds for each of 1 candidates, totalling 3 fits
[CV]  ................................................................
[CV] ...................................... , score=-0.860602 -   0.0s
[Parallel(n_jobs=1)]: Done   1 jobs       | elapsed:    0.0s
[CV]  ................................................................
[CV] ....................................... , score=0.203596 -   0.0s
[CV]  ................................................................
[CV] ...................................... , score=-0.813093 -   0.0s
[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.0s finished

Note the score=-0.860602, score=0.203596 and score=-0.813093 in the GridSearchCV output; these are exactly the values returned by cross_val_score.
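The session above uses the older API, where KFold took the number of samples directly. As a minimal sketch of the same check against the newer sklearn.model_selection module, the snippet below compares best_score_ with the plain mean of the per-fold scores. The fixed random seed and the variable names (rng, fold_scores) are assumptions added for illustration and are not part of the original answer.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Fixed seed is an assumption for reproducibility; the original used unseeded noise.
rng = np.random.RandomState(0)
y = np.linspace(-5, 5, 200)
X = (y + rng.randn(200)).reshape(-1, 1)

# Materialise the splits once so both calls see identical folds.
threefold = list(KFold(n_splits=3).split(X, y))

# Per-fold scores (R^2 is the default scorer for regressors).
fold_scores = cross_val_score(LinearRegression(), X, y, cv=threefold)

# An empty grid means a single candidate: the estimator with its default parameters.
gs = GridSearchCV(LinearRegression(), {}, cv=threefold).fit(X, y)

print("per-fold scores:", fold_scores)
print("mean of folds:  ", fold_scores.mean())
print("best_score_:    ", gs.best_score_)
```

With only one candidate in the grid, best_score_ should agree with the mean of the three fold scores up to floating-point error.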

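Note that the "mean" is really a macro-average over the folds. The iid parameter of GridSearchCV can be used to get a micro-average over the samples instead.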
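To make the difference between these averages concrete, here is a rough sketch that computes three quantities by hand: the unweighted mean of the per-fold R² scores (the macro-average), the same scores weighted by test-fold size (which, to my understanding, is what iid=True did in older releases; that parameter no longer exists in recent scikit-learn versions), and the single R² over all pooled out-of-fold predictions, which is what the code in the question computes. The seed and all variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold

# Fixed seed is an assumption for reproducibility.
rng = np.random.RandomState(0)
y = np.linspace(-5, 5, 200)
X = (y + rng.randn(200)).reshape(-1, 1)

fold_scores, fold_sizes = [], []
pooled_pred = np.full(y.shape, np.nan)  # out-of-fold predictions for every sample

for train_ind, test_ind in KFold(n_splits=3).split(X):
    model = LinearRegression().fit(X[train_ind], y[train_ind])
    pred = model.predict(X[test_ind])
    pooled_pred[test_ind] = pred
    fold_scores.append(r2_score(y[test_ind], pred))  # score on this fold only
    fold_sizes.append(len(test_ind))

macro = np.mean(fold_scores)                            # unweighted mean over folds
weighted = np.average(fold_scores, weights=fold_sizes)  # folds weighted by test size
pooled = r2_score(y, pooled_pred)                       # one R^2 over all predictions

print("macro-average of fold scores:", macro)
print("size-weighted average:       ", weighted)
print("pooled R^2 (as in question): ", pooled)
```

Because R² is normalised by the variance of y inside each test fold, the pooled value is generally not an average of the per-fold values. On sorted, unshuffled data like this, each contiguous fold covers only a narrow range of y, so the per-fold scores can be far lower than the pooled one, which would explain the mismatch described in the question.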


Answer 1
9 accepted 2014-06-07 10:36:40
