
Python: count TP, FP, FN and TN

I have a dataframe with the true class and the class predicted by some algorithm.

     true  pred
0       1     0
1       1     1
2       1     1
3       0     0
4       1     1

I tried to use

def classification(y_actual, y_hat):
    TP = 0
    FP = 0
    TN = 0
    FN = 0

    for i in range(len(y_hat)): 
        if y_actual[i] == y_hat[i] == 1:
           TP += 1
    for i in range(len(y_hat)): 
        if y_actual[i] == 1 and y_actual != y_hat[i]:
           FP += 1
    for i in range(len(y_hat)): 
        if y_actual[i] == y_hat[i] == 0:
           TN += 1
    for i in range(len(y_hat)): 
        if y_actual[i] == 0 and y_actual != y_hat[i]:
           FN += 1

    return(TP, FP, TN, FN)

but it raises

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I fix that, or is there a better approach?

The error occurs because Python tries to convert a whole Series to a single boolean and fails.

That's because you're comparing y_actual (the entire Series) with y_hat[i] (a single value).

It should be y_actual[i] != y_hat[i] (in two places in the code).

(I realize that it's just a typo, but the message is cryptic enough to make the problem interesting.)
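To see where the message comes from, here is a minimal sketch that reproduces it on the data from the question. It assumes pandas is installed; the variable names (elementwise, raised, message, scalar_result) are made up for illustration.

```python
import pandas as pd

# Toy data copied from the question.
df = pd.DataFrame({"true": [1, 1, 1, 0, 1], "pred": [0, 1, 1, 0, 1]})

# Comparing the whole Series with one scalar gives a boolean Series,
# one value per row, not a single True/False.
elementwise = df["true"] != df["pred"][0]

try:
    # `if` needs a single bool, so pandas refuses to guess and raises.
    if elementwise:
        pass
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

# Comparing two scalars gives an ordinary boolean instead.
scalar_result = bool(df["true"][0] != df["pred"][0])

print(raised, scalar_result)
```

The same thing happens in the question's code: the right-hand side of the `and` is a Series, and the `if` then has to turn it into one boolean.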

While we're at it, you could make the routine more efficient by merging all your counters into a single loop and using enumerate to avoid at least one access by index:

def classification(y_actual, y_hat):
    TP = 0
    FP = 0
    TN = 0
    FN = 0

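    for i, yh in enumerate(y_hat):
        if y_actual[i] == yh == 1:
            TP += 1
        # actual 0, predicted 1: false positive (FP and FN were swapped in the question's code)
        if y_actual[i] == 0 and y_actual[i] != yh:
            FP += 1
        if y_actual[i] == yh == 0:
            TN += 1
        # actual 1, predicted 0: false negative
        if y_actual[i] == 1 and y_actual[i] != yh:
            FN += 1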

    return(TP, FP, TN, FN)

Written this way, it can be simplified even further, cutting out many of the tests and branches:

def classification(y_actual, y_hat):
    TP = 0
    FP = 0
    TN = 0
    FN = 0

    for i, yh in enumerate(y_hat):
        if y_actual[i] == yh:
            if yh == 1:
                TP += 1
            elif yh == 0:
                TN += 1
        else:  # y_actual[i] != yh
            if y_actual[i] == 0:
                FP += 1  # actual 0, predicted 1
            elif y_actual[i] == 1:
                FN += 1  # actual 1, predicted 0

    return(TP, FP, TN, FN)
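As a further variation (not part of the original answer), the index lookups can be dropped entirely by pairing the values with zip and counting the pairs. This is only a sketch: the name confusion_counts is hypothetical, and it assumes binary 0/1 labels and the usual convention that a false positive is actual 0 predicted as 1.

```python
from collections import Counter


def confusion_counts(y_actual, y_hat):
    """Return (TP, FP, TN, FN) for binary 0/1 labels."""
    # Count every (actual, predicted) pair once; zip avoids indexing.
    pairs = Counter(zip(y_actual, y_hat))
    tp = pairs[(1, 1)]  # actual 1, predicted 1
    fp = pairs[(0, 1)]  # actual 0, predicted 1
    tn = pairs[(0, 0)]  # actual 0, predicted 0
    fn = pairs[(1, 0)]  # actual 1, predicted 0
    return tp, fp, tn, fn


# Data from the question.
y_true = [1, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 1]
result = confusion_counts(y_true, y_pred)
print(result)  # (3, 0, 1, 1)
```

Pairs that never occur simply count as 0, so no special-casing is needed. The two dataframe columns could presumably be passed in place of the lists.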

I used confusion_matrix from sklearn.metrics, which returns the matrix I need.
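A minimal sketch of that approach on the data from the question, assuming scikit-learn is installed. Passing labels=[0, 1] is a choice made here to pin the row and column order; the answer above does not say how the function was called.

```python
from sklearn.metrics import confusion_matrix

# Data from the question.
y_true = [1, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 1]

# Rows are actual classes, columns are predicted classes, both in the
# order given by `labels`, so the layout is [[TN, FP], [FN, TP]].
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])

# Flatten row by row to unpack the four counts.
tn, fp, fn, tp = (int(v) for v in matrix.ravel())
print(tp, fp, tn, fn)  # 3 0 1 1
```

Note the order after ravel(): TN, FP, FN, TP, which differs from the (TP, FP, TN, FN) order returned by the hand-written functions.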
