How to evaluate a PyTorch model using metrics like precision and recall?
I have trained a simple PyTorch neural network on some data, and now wish to test and evaluate it using metrics like accuracy, recall, F1 and precision. I searched the PyTorch documentation thoroughly and could not find any classes or functions for these metrics. I then tried converting the predicted labels and the actual labels to numpy arrays and using scikit-learn's metrics, but the predicted labels don't seem to be either 0 or 1 (my labels), but instead continuous values. Because of this, the scikit-learn metrics don't work. The Fast.ai documentation didn't make much sense either; I could not understand which class to inherit for precision etc. (although I was able to calculate accuracy). Any help would be much appreciated.
Usually, in a binary classification setting, your neural network will output the probability that the event occurs (e.g., if you are using a sigmoid activation and a single neuron at the output layer), which is a continuous value between 0 and 1. To evaluate the precision and recall of your model (e.g., with scikit-learn's precision_score and recall_score), you need to convert your model's probabilities into binary values. This is achieved by specifying a threshold on the model's probability. (For an overview of thresholding, please take a look at this reference: https://developers.google.com/machine-learning/crash-course/classification/thresholding )
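As a minimal sketch (the arrays below are illustrative stand-ins for your model's outputs; in practice you would obtain y_prob from something like model(x_test).detach().numpy().ravel()), thresholding at 0.5 and then calling the scikit-learn metrics could look like:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Illustrative sigmoid outputs and ground-truth labels
y_prob = np.array([0.9, 0.1, 0.8, 0.4, 0.6, 0.2])
y_true = np.array([1, 0, 1, 1, 0, 0])

# Convert continuous probabilities to binary predictions with a 0.5 threshold
y_pred = (y_prob >= 0.5).astype(int)

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
```

Once y_pred contains only 0s and 1s, all of scikit-learn's classification metrics accept it directly.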
Scikit-learn's precision_recall_curve ( https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_curve.html ) is commonly used to understand how the precision and recall metrics behave for different probability thresholds. By analysing the precision and recall values per threshold, you will be able to pick the best threshold for your problem (you may want higher precision, so you will aim for higher thresholds, e.g., 90%; or you may want balanced precision and recall, in which case you should look for the threshold that returns the best F1 score for your problem). A good overview of the topic can be found in the following reference: https://machinelearningmastery.com/threshold-moving-for-imbalanced-classification/
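For example (again with illustrative arrays in place of your real labels and model scores), you could scan the thresholds returned by precision_recall_curve and pick the one that maximises F1:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Illustrative ground-truth labels and model probabilities
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.4, 0.7, 0.55, 0.6, 0.2, 0.8, 0.3])

precision, recall, thresholds = precision_recall_curve(y_true, y_prob)

# precision and recall have one more entry than thresholds (a final (1, 0)
# point), so drop the last point before computing F1 per threshold
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
best = np.argmax(f1)
print("best threshold:", thresholds[best], "with F1:", f1[best])
```

The small 1e-12 term only guards against division by zero when both precision and recall are 0 at some threshold.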
I hope this may be of help.