
Calculating TP, FP, TN, FN values

I am trying to build a very simple program that calculates TP/FP/FN/TN for two strings (predicted secondary protein structure vs. proven secondary protein structure), but it does not calculate them correctly. What am I missing?

actual_str = '*ΟΟΟΟΟΟ******////////////**//////////*****////ΟΟΟΟΟΟΟΟΟ***'
predicted_str = '****--********/////////-----//////****----**-ΟΟΟΟΟΟΟ/-****'

TP = 0
FP = 0
TN = 0
FN = 0

for i in range(len(predicted_str)): 
    if predicted_str[i]==actual_str[i]=='O':
        TP += 1
        
    if predicted_str[i]!='O' and actual_str[i]=='O': 
        FP += 1
        
    if predicted_str[i]==actual_str[i]=='/' or predicted_str[i]==actual_str[i]=='*':
        TN += 1
        
    if predicted_str[i]=='O' and actual_str[i]!='O':
        FN += 1
        
    if predicted_str[i]=='-': #just ignore the '-' and move on to the next
        i+=1

print(TP, FP, TN, FN)
    

Output: 0 0 26 0

This is a strange one, but try copying one of the 'O' characters used in the actual_str or predicted_str variables and pasting it into your if-statements. I think there is a mismatch, even though the characters look identical.

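Also, the last if-statement is not necessary: incrementing i inside a for loop over range() does not skip anything, because i is reassigned on the next iteration.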
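A minimal sketch of why that last if-statement has no effect. The helper name visited_indices and the sample string are made up for illustration; they are not from the question.

```python
def visited_indices(text, skip_char="-"):
    """Return every index the loop body actually runs for."""
    visited = []
    for i in range(len(text)):
        visited.append(i)
        if text[i] == skip_char:
            # This only rebinds the local name i; the for statement
            # assigns the next value from range() regardless.
            i += 1
    return visited


print(visited_indices("a-b"))  # [0, 1, 2]
```

No index is skipped. If the intent is to ignore a position entirely, a continue at the top of the loop body is the usual way to do it.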

As already pointed out, the characters you are using are different: the strings contain the Greek capital letter omicron (Ο, U+039F), while the if-statements compare against the Latin capital letter O (U+004F).

https://apps.timwhitlock.info/unicode/inspect?s=%CE%9F
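The same check can be done locally with Python's standard unicodedata module. This is a small sketch; the helper name describe is an assumption introduced here for illustration.

```python
import unicodedata


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


greek = "\u039f"  # the character that appears in the two strings
latin = "O"       # the character typed in the if-statements

print(describe(greek))  # U+039F GREEK CAPITAL LETTER OMICRON
print(describe(latin))  # U+004F LATIN CAPITAL LETTER O
print(greek == latin)   # False
```

Running describe over set(actual_str) would list every distinct character in the string, which makes lookalikes easy to spot.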

In addition, instead of comparing by index, it makes sense to use the zip function in this use case:

for (actual, predicted) in zip(actual_str, predicted_str):
   if (..
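One way the zip-based loop could be completed is sketched below. It is not the answerer's code. It assumes that the Greek omicron is the positive class, that positions predicted as '-' are skipped, and that every other pair with no omicron on either side counts as a true negative (broader than the question's TN test). It also follows the usual convention that a predicted positive which is actually negative is a false positive, which is the reverse of the FP/FN assignment in the question's code. The function name, its parameters and the short sample strings are hypothetical.

```python
OMICRON = "\u039f"  # Greek capital omicron, as used in the question's strings


def confusion_counts(actual, predicted, positive=OMICRON, ignore="-"):
    """Count (TP, FP, TN, FN) position by position, skipping ignored predictions."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == ignore:
            continue  # no prediction at this position
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return tp, fp, tn, fn


# Short made-up strings, written with the OMICRON constant to avoid lookalikes.
actual = OMICRON * 2 + "*/" + OMICRON + "*"
predicted = OMICRON + "**-" + OMICRON * 2
print(confusion_counts(actual, predicted))  # (2, 1, 1, 1)
```

Using if/elif/else guarantees each position lands in exactly one bucket, and zip stops at the shorter string, so a length mismatch cannot raise an IndexError.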
