Per-class precision and recall for multi-class classification in TensorFlow?

Question

Is there a way to get per-class precision or recall when doing multi-class classification with TensorFlow?

For example, if I have y_true and y_pred from each batch, is there a functional way to get precision or recall per class when I have more than 2 classes?

Answer 1

Here's a solution that works for me on a problem with n=6 classes. If you have many more classes, this solution is probably slow, and you should use some sort of mapping instead of a loop.

Assume you have one-hot encoded class labels in the rows of tensor labels and logits (or posteriors) in tensor logits. Then, if n is the number of classes, try this:

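import tensorflow as tf

# Convert one-hot labels and logits to class indices.
y_true = tf.argmax(labels, 1)
y_pred = tf.argmax(logits, 1)

# One (value, update_op) pair per class.
recall = [None] * n
update_op_rec = [None] * n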

for k in range(n):
    recall[k], update_op_rec[k] = tf.metrics.recall(
        labels=tf.equal(y_true, k),
        predictions=tf.equal(y_pred, k)
    )

Note that inside tf.metrics.recall, labels and predictions are set to boolean vectors, as in the two-class case, which is what allows the function to be used here.
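The one-vs-rest reduction used in that loop can be checked without TensorFlow. The following is a minimal plain-Python sketch of the arithmetic, not the tf.metrics.recall implementation; the helper name per_class_recall and the sample label lists are made up for illustration.

```python
def per_class_recall(y_true, y_pred, n):
    """Recall for each of n classes, treating class k as positive and the rest as negative."""
    recalls = []
    for k in range(n):
        # Boolean vectors, analogous to tf.equal(y_true, k) and tf.equal(y_pred, k).
        is_true_k = [t == k for t in y_true]
        is_pred_k = [p == k for p in y_pred]
        tp = sum(t and p for t, p in zip(is_true_k, is_pred_k))
        fn = sum(t and not p for t, p in zip(is_true_k, is_pred_k))
        # Report 0.0 when class k never occurs in y_true (a convention chosen here).
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return recalls


# Hypothetical class indices for a 3-class problem.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]
print(per_class_recall(y_true, y_pred, 3))  # [0.5, 0.5, 1.0]
```

Swapping the false-negative count for a false-positive count (predicted k, actually something else) gives per-class precision in the same way.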

Answer 2

Two facts:

As stated in other answers, TensorFlow's built-in precision and recall metrics don't support multi-class (the doc says the inputs will be cast to bool).
There are ways of getting one-versus-all scores: use precision_at_k and specify the class_id, or simply cast your labels and predictions to tf.bool in the right way.

Because this is unsatisfying and incomplete, I wrote tf_metrics, a simple package for multi-class metrics that you can find on GitHub. It supports multiple averaging methods, like scikit-learn.

Example

import tensorflow as tf
import tf_metrics

y_true = [0, 1, 0, 0, 0, 2, 3, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 2, 0, 3, 3, 1]
pos_indices = [1]        # Metrics for class 1 -- or
pos_indices = [1, 2, 3]  # Average metrics, 0 is the 'negative' class
num_classes = 4
average = 'micro'

# Tuple of (value, update_op)
precision = tf_metrics.precision(
    y_true, y_pred, num_classes, pos_indices, average=average)
recall = tf_metrics.recall(
    y_true, y_pred, num_classes, pos_indices, average=average)
f2 = tf_metrics.fbeta(
    y_true, y_pred, num_classes, pos_indices, average=average, beta=2)
f1 = tf_metrics.f1(
    y_true, y_pred, num_classes, pos_indices, average=average)
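To make the averaging options concrete, here is a rough plain-Python sketch of micro and macro averaging over the positive classes, following the usual scikit-learn definitions. It is not the tf_metrics implementation; the function name averaged_precision_recall is hypothetical, and returning 0.0 for an empty denominator is a convention chosen here.

```python
def averaged_precision_recall(y_true, y_pred, pos_indices, average="micro"):
    """Return (precision, recall) averaged over the classes in pos_indices."""
    counts = []
    for k in pos_indices:
        tp = sum(t == k and p == k for t, p in zip(y_true, y_pred))
        fp = sum(t != k and p == k for t, p in zip(y_true, y_pred))
        fn = sum(t == k and p != k for t, p in zip(y_true, y_pred))
        counts.append((tp, fp, fn))

    def ratio(num, den):
        return num / den if den else 0.0

    if average == "micro":
        # Pool the counts of all positive classes, then take one ratio.
        tp, fp, fn = (sum(c) for c in zip(*counts))
        return ratio(tp, tp + fp), ratio(tp, tp + fn)
    if average == "macro":
        # Compute the ratio per class, then take the unweighted mean.
        precisions = [ratio(tp, tp + fp) for tp, fp, fn in counts]
        recalls = [ratio(tp, tp + fn) for tp, fp, fn in counts]
        return sum(precisions) / len(counts), sum(recalls) / len(counts)
    raise ValueError("unsupported average: " + average)


y_true = [0, 1, 0, 0, 0, 2, 3, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 2, 0, 3, 3, 1]
print(averaged_precision_recall(y_true, y_pred, [1, 2, 3], "micro"))  # (0.5, 0.75)
print(averaged_precision_recall(y_true, y_pred, [1, 2, 3], "macro"))
```

With class 0 treated as the negative class, the pooled counts are TP=3, FP=3, FN=1, which is where the micro values 0.5 and 0.75 come from.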

Answer 3

I believe you cannot compute multi-class precision, recall, and F1 with the tf.metrics.precision/recall functions. You can use sklearn like this for a 3-class scenario:

from sklearn.metrics import precision_recall_fscore_support as score

prediction = [1,2,3,2] 
y_original = [1,2,3,3]

precision, recall, f1, _ = score(y_original, prediction)

print('precision: {}'.format(precision))
print('recall: {}'.format(recall))
print('fscore: {}'.format(f1))

This prints arrays of per-class precision, recall, and F-score values; format them as you like.

Answer 4

I was puzzled by this problem for quite a long time. I know it can be solved with sklearn, but I really wanted to solve it with TensorFlow's API. By reading its code, I finally figured out how this API works.

tf.metrics.precision_at_k(labels, predictions, k, class_id)

First, let's assume this is a 4-class problem.
Second, we have two samples whose labels are 3 and 1 (counting classes from 1), and their predictions are [0.5,0.3,0.1,0.1] and [0.5,0.3,0.1,0.1]. According to these predictions, both samples are predicted as class 1.
Third, to get the precision of class 1, use the formula TP/(TP+FP); we expect the result to be 1/(1+1)=0.5. Both samples are predicted as 1, but one of them is actually 3, so TP is 1, FP is 1, and the result is 0.5.

Finally, let's use this API to verify our assumption.

import tensorflow as tf

labels = tf.constant([[2], [0]], tf.int64)
predictions = tf.constant([[0.5, 0.3, 0.1, 0.1], [0.5, 0.3, 0.1, 0.1]])
metric = tf.metrics.precision_at_k(labels, predictions, 1, class_id=0)

sess = tf.Session()
sess.run(tf.local_variables_initializer())

precision, update = sess.run(metric)
print(precision)  # 0.5

Notice

k isn't the number of classes. It is the number of top-scoring classes considered for each sample; the last dimension of predictions is the number of classes.
class_id is the class for which we want binary metrics.
With k=1, only the top-scoring class of each sample counts as its prediction, so the computation is effectively a binary classification for the class given by class_id. With a larger k, several classes per sample count as predicted, and the result is no longer the ordinary per-class precision.
One more important thing: to get the right result, subtract 1 from labels that are numbered from 1, because class_id is an index into the last dimension of predictions, and that index starts at 0.
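The TP/(TP+FP) reasoning above can be reproduced by hand. This is a plain-Python sketch of the arithmetic only, under the assumption that a class counts as predicted when it is among the k highest scores; it is not TensorFlow's implementation, and precision_at_k_for_class is a made-up name.

```python
def precision_at_k_for_class(labels, predictions, k, class_id):
    """Precision for class_id, where 'predicted' means class_id is in the top k scores."""
    tp = fp = 0
    for label, scores in zip(labels, predictions):
        # Indices of the k highest scores; ties go to the lower index.
        top_k = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
        if class_id in top_k:
            if label == class_id:
                tp += 1
            else:
                fp += 1
    # Return 0.0 when class_id is never predicted (a convention chosen here).
    return tp / (tp + fp) if tp + fp else 0.0


# Zero-based labels: the samples labelled 3 and 1 in the text become 2 and 0.
labels = [2, 0]
predictions = [[0.5, 0.3, 0.1, 0.1], [0.5, 0.3, 0.1, 0.1]]
print(precision_at_k_for_class(labels, predictions, k=1, class_id=0))  # 0.5
```

Both samples have class 0 as their top score, one of them is truly class 0, so TP=1 and FP=1.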

Answer 5

There is a way to do this in TensorFlow.

tf.metrics.precision_at_k(labels, predictions, k, class_id)

Set k = 1 and set the corresponding class_id. For example, class_id=0 computes the precision of the first class.

Answer 6

I believe TF does not provide such functionality yet. The docs (https://www.tensorflow.org/api_docs/python/tf/metrics/precision ) say that both the labels and the predictions will be cast to bool, so the metric applies only to binary classification. Perhaps it's possible to one-hot encode the examples and have it work? I'm not sure about this.

Answer 7

Here's a complete example, from predicting in TensorFlow to reporting via scikit-learn:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn.metrics import classification_report

# Given a trained model `model`, test inputs `X_test`, and their true classes
# `y_test`. `y_test` and `y_predicted` are integer class indices whose names
# are listed in `labels`.
y_predicted = tf.argmax(model.predict(X_test), axis=1)

# Confusion matrix
cf = tf.math.confusion_matrix(y_test, y_predicted)
plt.matshow(cf, cmap='magma')
plt.colorbar()
plt.xticks(np.arange(len(labels)), labels=labels, rotation=90)
plt.yticks(np.arange(len(labels)), labels=labels)
plt.clim(0, None)

# Report
print(classification_report(y_test, y_predicted, target_names=labels))
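Per-class precision and recall can also be read straight off a confusion matrix. The sketch below uses plain Python lists and assumes the same layout as tf.math.confusion_matrix (rows are true classes, columns are predicted classes); the helper names and the tiny data set are illustrative only.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def precision_recall_from_cm(cm):
    """Per-class precision (diagonal / column sum) and recall (diagonal / row sum)."""
    n = len(cm)
    precision, recall = [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        # Use 0.0 for classes that are never predicted or never occur.
        precision.append(tp / predicted_k if predicted_k else 0.0)
        recall.append(tp / actual_k if actual_k else 0.0)
    return precision, recall


# Hypothetical 3-class data with zero-based class indices.
y_test = [0, 1, 2, 2]
y_predicted = [0, 1, 2, 1]
cm = confusion_matrix(y_test, y_predicted, 3)
print(cm)                            # [[1, 0, 0], [0, 1, 0], [0, 1, 1]]
print(precision_recall_from_cm(cm))  # ([1.0, 0.5, 1.0], [1.0, 1.0, 0.5])
```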

7 answers

Answer 1
6 2018-01-25 22:15:09

Answer 2
4 2018-09-16 22:00:11

Answer 3
3 2018-01-13 00:50:18

Answer 4
3 2018-08-02 08:06:44

Answer 5
2 2018-06-06 14:33:54

Answer 6
1 2017-11-14 14:50:35

Answer 7
0 2020-05-22 02:00:43
