Scikit-learn: What is the simplest way to get the confusion matrix of an estimator when using GridSearchCV?

Question

In this simplified example, I've trained a learner with GridSearchCV. I would like to get the confusion matrix of the best learner when predicting on the full set X.

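from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

lr_pipeline = Pipeline([('clf', LogisticRegression())])
lr_parameters = {}

lr_gs = GridSearchCV(lr_pipeline, lr_parameters, n_jobs=-1)
lr_gs = lr_gs.fit(X, y)

print(lr_gs.confusion_matrix)  # Would like to be able to do this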

Thanks

Answer 1

You will first need to predict using the best estimator found by your GridSearchCV. A common method to use is GridSearchCV.decision_function(), but for your example decision_function returns confidence scores from LogisticRegression rather than class labels, so its output does not work with confusion_matrix. Instead, take the best estimator from lr_gs and predict the labels with that estimator.

y_pred = lr_gs.best_estimator_.predict(X)

Finally, use sklearn's confusion_matrix on the real and predicted y:

from sklearn.metrics import confusion_matrix
print(confusion_matrix(y, y_pred))
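Putting the steps of this answer together, a minimal end-to-end sketch might look like the following. The iris dataset, the `max_iter` setting, and the `clf__C` grid are assumptions added only so the example runs on its own; they are not part of the original question.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Assumption: iris stands in for the question's X and y.
X, y = load_iris(return_X_y=True)

lr_pipeline = Pipeline([('clf', LogisticRegression(max_iter=1000))])
lr_parameters = {'clf__C': [0.1, 1.0, 10.0]}  # hypothetical grid

lr_gs = GridSearchCV(lr_pipeline, lr_parameters, cv=5)
lr_gs.fit(X, y)

# Predict with the refitted best estimator, then compare to the true labels.
y_pred = lr_gs.best_estimator_.predict(X)
cm = confusion_matrix(y, y_pred)
print(cm)
```

Note that this matrix describes predictions on the same data the best estimator was refitted on, so it says little about generalization.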

Answer 2

I found this question while searching for how to calculate the confusion matrix while fitting scikit-learn's GridSearchCV. I was able to find a solution by defining a custom scoring function, although it's somewhat kludgy. I'm leaving this answer for anyone else who makes a similar search.

As mentioned by @MLgeek and @bugo99iot, the accepted answer by @Sudeep Juvekar isn't really satisfactory. It offers a literal answer to the original question as asked, but a machine learning practitioner is not usually interested in the confusion matrix of a fitted model on its own training data. It is more typically of interest to know how well a model generalizes to data it hasn't seen.

To use a custom scoring function in GridSearchCV you will need to import the scikit-learn helper function make_scorer.

from sklearn.metrics import make_scorer

The custom scoring function looks like this:

def _count_score(y_true, y_pred, label1=0, label2=1):
    return sum((y == label1 and pred == label2)
                for y, pred in zip(y_true, y_pred))

For a given pair of labels (label1, label2), it counts the examples where the true value of y is label1 and the predicted value of y is label2.

To start, find all of the labels in the training data:

all_labels = sorted(set(y))

The optional scoring argument of GridSearchCV can receive a dictionary mapping strings to scorers. make_scorer takes a scoring function along with bindings for some of its parameters and produces a scorer, a particular type of callable used for scoring in GridSearchCV, cross_val_score, etc. Let's build up this dictionary with one entry for each pair of labels.

scorer = {}
for label1 in all_labels:
    for label2 in all_labels:
        count_score = make_scorer(_count_score, label1=label1,
                                  label2=label2)
        scorer['count_%s_%s' % (label1, label2)] = count_score

You'll also want to add any additional scoring functions you're interested in. To avoid getting into the subtleties of scoring for multi-class classification, let's add a simple accuracy score.

# import placed here for the sake of demonstration.
# Should be imported alongside make_scorer above
from sklearn.metrics import accuracy_score

scorer['accuracy'] = make_scorer(accuracy_score)

We can now fit GridSearchCV:

num_splits = 5
lr_gs = GridSearchCV(lr_pipeline, lr_parameters, n_jobs=-1,
                     scoring=scorer, refit='accuracy',
                     cv=num_splits)
lr_gs.fit(X, y)

refit='accuracy' tells GridSearchCV to use the best accuracy score to decide which parameters to use when refitting. When scoring is a dictionary of several scorers, refit has to name one of them explicitly; otherwise GridSearchCV has no single metric for choosing the best parameters and will not give you a model refitted on all of the training data. We've explicitly set the number of splits because we'll need to know it later.

Now, for each of the training folds used in cross-validation, what we've essentially done is calculate the confusion matrix on the respective test fold. The test folds do not overlap and together cover the entire data set. We've therefore made a prediction for each data point in X in such a way that the prediction for a point does not depend on the target label of that point.

We can add up the confusion matrices of the test folds to get something useful that tells us how well the model generalizes. It can also be interesting to look at the confusion matrices for the test folds separately and do things like calculate variances.
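To make the idea of per-fold matrices concrete, here is a sketch that computes them with an explicit loop rather than through the scorer dictionary. It is a stand-in for illustration, not the method this answer uses: the iris data, the plain LogisticRegression model, and the choice of StratifiedKFold are all assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold

# Assumption: iris stands in for X and y.
X, y = load_iris(return_X_y=True)
all_labels = sorted(set(y))

cv = StratifiedKFold(n_splits=5)
fold_matrices = []
for train_idx, test_idx in cv.split(X, y):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    y_pred = model.predict(X[test_idx])
    # labels= keeps the matrix shape fixed even if a fold misses a class.
    fold_matrices.append(
        confusion_matrix(y[test_idx], y_pred, labels=all_labels))

fold_matrices = np.array(fold_matrices)   # shape: (folds, labels, labels)
total = fold_matrices.sum(axis=0)         # summed out-of-fold confusion matrix
variance = fold_matrices.var(axis=0)      # per-entry variance across folds
print(total)
```

Because the test folds partition the data, the entries of the summed matrix add up to the number of samples.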

We're not done yet though. We still need to pull out the confusion matrix for the best estimator. In this example, the cross-validation results are stored in the dictionary lr_gs.cv_results_. First let's get the index in the results corresponding to the best set of parameters:

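# rank_test_accuracy holds one rank per parameter setting (1 is best),
# so the position of the smallest rank is the index of the best setting.
best_index = lr_gs.cv_results_['rank_test_accuracy'].argmin()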

If you are using a different metric to decide upon the best parameters, replace 'accuracy' with the key you used for that scorer in the scoring dictionary passed to GridSearchCV.

In my own application I chose to store the confusion matrix as a nested dictionary.

from collections import defaultdict

confusion = defaultdict(lambda: defaultdict(int))
for label1 in all_labels:
    for label2 in all_labels:
        for i in range(num_splits):
            # Per-fold count for (label1, label2) at the best parameter setting.
            key = 'split%s_test_count_%s_%s' % (i, label1, label2)
            val = int(lr_gs.cv_results_[key][best_index])
            confusion[label1][label2] += val
confusion = {key: dict(value) for key, value in confusion.items()}

There's some stuff to unpack here. defaultdict(lambda: defaultdict(int)) constructs a nested defaultdict: a defaultdict of defaultdict of int (if you're copying and pasting, don't forget to add from collections import defaultdict at the top of your file). The last line of this snippet turns confusion into a regular dict of dict of int. Never leave defaultdicts lying around when they are no longer needed.
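The nested defaultdict pattern can be tried in isolation. In this small sketch the list of (true, predicted) pairs is made up; the point is only to show the counting and the final conversion to plain dicts.

```python
from collections import defaultdict

# Missing outer keys get a fresh inner defaultdict; missing inner keys get 0.
confusion = defaultdict(lambda: defaultdict(int))

# Hypothetical (true label, predicted label) pairs.
pairs = [('cat', 'cat'), ('cat', 'dog'), ('dog', 'dog'), ('cat', 'cat')]
for true_label, predicted_label in pairs:
    confusion[true_label][predicted_label] += 1

# Convert to regular dicts so later lookups of missing keys raise KeyError
# instead of silently creating new entries.
confusion = {key: dict(value) for key, value in confusion.items()}
print(confusion)
```

The conversion matters because merely reading a missing key from a defaultdict inserts it, which can quietly change the contents of the matrix.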

You will likely want to store your confusion matrix in a different way. The key fact is that the confusion matrix entry for the pair of labels label1, label2 on test fold i is stored in

lr_gs.cv_results_['split%s_test_count_%s_%s' % (i, label1, label2)][best_index]

See here for an example of this confusion matrix calculation used in practice. I think it's a bit of a code smell to rely on the specific format of the keys in the cv_results_ dictionary, but this does work, at least as of the day of this post.
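For readers who prefer an array to a nested dictionary, the whole procedure can be sketched end to end as follows. The iris data, the `max_iter` setting, the `clf__C` grid, and the use of `lr_gs.best_index_` (which GridSearchCV sets when refit names a scorer) are assumptions chosen for this example; the key format is the one described above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


def _count_score(y_true, y_pred, label1=0, label2=1):
    # Number of examples whose true label is label1 and predicted label is label2.
    return sum((y == label1 and pred == label2)
               for y, pred in zip(y_true, y_pred))


# Assumption: iris stands in for X and y.
X, y = load_iris(return_X_y=True)
all_labels = sorted(set(y))
num_splits = 5

# One counting scorer per (true, predicted) pair, plus accuracy for refit.
scorer = {'accuracy': make_scorer(accuracy_score)}
for label1 in all_labels:
    for label2 in all_labels:
        scorer['count_%s_%s' % (label1, label2)] = make_scorer(
            _count_score, label1=label1, label2=label2)

lr_pipeline = Pipeline([('clf', LogisticRegression(max_iter=1000))])
lr_parameters = {'clf__C': [0.1, 1.0, 10.0]}  # hypothetical grid
lr_gs = GridSearchCV(lr_pipeline, lr_parameters, scoring=scorer,
                     refit='accuracy', cv=num_splits)
lr_gs.fit(X, y)

best_index = lr_gs.best_index_

# Sum the per-fold counts into a labels-by-labels array
# (rows: true label, columns: predicted label).
confusion = np.zeros((len(all_labels), len(all_labels)), dtype=int)
for row, label1 in enumerate(all_labels):
    for col, label2 in enumerate(all_labels):
        for i in range(num_splits):
            key = 'split%s_test_count_%s_%s' % (i, label1, label2)
            confusion[row, col] += int(lr_gs.cv_results_[key][best_index])

print(confusion)
```

Since every sample lands in exactly one test fold, the entries of the resulting array should add up to the number of samples, which is a cheap check that the keys were read correctly.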


Answer 1: score 10, accepted, 2016-03-22 21:52:44

Answer 2: score 2, 2020-12-16 03:35:52
