Python: calculating TP, FP, FN and TN

Question

I have a DataFrame with the true class and the class predicted by some algorithm.

     true  pred
0       1     0
1       1     1
2       1     1
3       0     0
4       1     1

I tried to use

def classification(y_actual, y_hat):
    TP = 0
    FP = 0
    TN = 0
    FN = 0

    for i in range(len(y_hat)): 
        if y_actual[i] == y_hat[i] == 1:
           TP += 1
    for i in range(len(y_hat)): 
        if y_actual[i] == 1 and y_actual != y_hat[i]:
           FP += 1
    for i in range(len(y_hat)): 
        if y_actual[i] == y_hat[i] == 0:
           TN += 1
    for i in range(len(y_hat)): 
        if y_actual[i] == 0 and y_actual != y_hat[i]:
           FN += 1

    return(TP, FP, TN, FN)

but it raises

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I fix that, or is there a better approach?

Answer 1

The error occurs because Python tries to convert a whole Series to a single boolean and fails.

That's because you're comparing y_actual (the entire Series) with y_hat[i] (a single value), so the result is a Series of booleans rather than one boolean.

It should be y_actual[i] != y_hat[i] (in two places in the code).

(I realize that it's just a typo, but the message is cryptic enough for the problem to become interesting.)
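To make the failure concrete, here is a minimal sketch that reproduces the error outside the function. It assumes the two columns from the question are pandas Series; the variable names mask and message are only illustrative.

```python
import pandas as pd

y_actual = pd.Series([1, 1, 1, 0, 1])
y_hat = pd.Series([0, 1, 1, 0, 1])

# Comparing the whole Series with one element gives a Series of booleans.
mask = y_actual != y_hat[0]
print(type(mask).__name__)  # Series

message = None
try:
    # "if" needs a single truth value, but mask holds five of them.
    if y_actual[0] == 1 and mask:
        pass
except ValueError as exc:
    message = str(exc)
    print(message)

# Comparing element with element gives a single boolean, which "if" accepts.
is_different = bool(y_actual[0] != y_hat[0])
print(is_different)  # True
```

The element-wise comparison on the last lines is the form the corrected condition uses.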

While we're at it, you could make the routine more efficient by merging all your counters into a single loop and using enumerate to avoid at least one access by index:

def classification(y_actual, y_hat):
    TP = 0
    FP = 0
    TN = 0
    FN = 0

    for i, yh in enumerate(y_hat):
        if y_actual[i] == yh == 1:
            TP += 1
        # actual 1, predicted 0 is a false negative
        # (the question's code counted this case as FP)
        if y_actual[i] == 1 and y_actual[i] != yh:
            FN += 1
        if y_actual[i] == yh == 0:
            TN += 1
        # actual 0, predicted 1 is a false positive
        if y_actual[i] == 0 and y_actual[i] != yh:
            FP += 1

    return (TP, FP, TN, FN)

Written this way, you can see that it can be simplified even further, cutting out many of the tests and branches:

def classification(y_actual, y_hat):
    TP = 0
    FP = 0
    TN = 0
    FN = 0

    for i, yh in enumerate(y_hat):
        if y_actual[i] == yh:
            # prediction is correct
            if yh == 1:
                TP += 1
            elif yh == 0:
                TN += 1
        else:  # y_actual[i] != yh
            if y_actual[i] == 1:
                FN += 1  # actual 1, predicted 0
            elif y_actual[i] == 0:
                FP += 1  # actual 0, predicted 1

    return (TP, FP, TN, FN)
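Going one step further than enumerate, the loop can avoid index access entirely by iterating over both sequences with zip. The sketch below is not part of the original answer; the name classification_counts is hypothetical, and it assumes labels are encoded as 0 and 1 with the usual convention (FP: actual 0, predicted 1; FN: actual 1, predicted 0).

```python
def classification_counts(y_actual, y_hat):
    """Return (TP, FP, TN, FN) for binary labels encoded as 0 and 1."""
    TP = FP = TN = FN = 0
    # zip pairs the values positionally, so no y_actual[i] lookup is needed.
    for actual, predicted in zip(y_actual, y_hat):
        if actual == 1 and predicted == 1:
            TP += 1
        elif actual == 0 and predicted == 1:
            FP += 1
        elif actual == 0 and predicted == 0:
            TN += 1
        elif actual == 1 and predicted == 0:
            FN += 1
    return TP, FP, TN, FN


# Data from the question: one actual positive (row 0) was predicted as 0.
print(classification_counts([1, 1, 1, 0, 1], [0, 1, 1, 0, 1]))  # (3, 0, 1, 1)
```

Because zip pairs values by position, this also works when the arguments are pandas Series whose index does not start at 0, a case where y_actual[i] would look up labels rather than positions.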

Answer 2

I use confusion_matrix from sklearn.metrics, which returns the matrix I need.
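A minimal sketch of that approach, using the data from the question as plain lists (the DataFrame columns could be passed the same way). Passing labels=[0, 1] is a choice made here to pin the row and column order; for binary labels, flattening the matrix with ravel() yields the counts in the order TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

y_actual = [1, 1, 1, 0, 1]
y_hat = [0, 1, 1, 0, 1]

# Rows are true classes, columns are predicted classes, both ordered [0, 1].
matrix = confusion_matrix(y_actual, y_hat, labels=[0, 1])

# For a 2x2 matrix, ravel() flattens row by row: TN, FP, FN, TP.
TN, FP, FN, TP = matrix.ravel()
print(TP, FP, TN, FN)  # 3 0 1 1
```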
