Calculating TP, FP, TN, FN values

Question

I am trying to build a very simple program for calculating TP/FP/FN/TN for two strings (predicted secondary protein structure vs. proven secondary protein structure), but it does not calculate them correctly. What am I missing?

actual_str = '*ΟΟΟΟΟΟ******////////////**//////////*****////ΟΟΟΟΟΟΟΟΟ***'
predicted_str = '****--********/////////-----//////****----**-ΟΟΟΟΟΟΟ/-****'

TP = 0
FP = 0
TN = 0
FN = 0

for i in range(len(predicted_str)): 
    if predicted_str[i]==actual_str[i]=='O':
        TP += 1
        
    if predicted_str[i]!='O' and actual_str[i]=='O': 
        FP += 1
        
    if predicted_str[i]==actual_str[i]=='/' or predicted_str[i]==actual_str[i]=='*':
        TN += 1
        
    if predicted_str[i]=='O' and actual_str[i]!='O':
        FN += 1
        
    if predicted_str[i]=='-': #just ignore the '-' and move on to the next
        i+=1

print(TP, FP, TN, FN)

Output: 0 0 26 0

Answer 1

This is a strange one, but try copying one of the 'O' characters from the actual_str or predicted_str variables and pasting it into your if-statements. I think the characters do not match, even though they look identical.
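One way to confirm such a mismatch is to print the code point and Unicode name of each character. This is a minimal sketch using the standard `unicodedata` module. The helper name `describe` is made up for illustration, and the sketch assumes the data strings hold U+039F while the comparisons use U+004F.

```python
import unicodedata


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


greek = "\u039f"  # assumed: the character used inside actual_str / predicted_str
latin = "O"       # assumed: the character typed in the if-statements

print(describe(greek))  # U+039F GREEK CAPITAL LETTER OMICRON
print(describe(latin))  # U+004F LATIN CAPITAL LETTER O
print(greek == latin)   # False
```

Running `describe` on a character copied straight out of your own strings shows which of the two you actually have.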

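Also, the last if-statement is not necessary: reassigning i inside a for loop over range() does not affect the next iteration, so it skips nothing.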
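A small demonstration of why incrementing the loop variable by hand does not skip anything. The list `visited` is only there for illustration.

```python
visited = []
for i in range(5):
    visited.append(i)
    if i == 1:
        i += 1  # rebinds the local name only; range() still supplies the next value

print(visited)  # [0, 1, 2, 3, 4]
```

To really skip a position, use `continue` at the top of the loop body instead.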

Answer 2

As already pointed out, the characters you are using are different: the code mixes the Greek capital letter omicron (U+039F) and the Latin capital letter O (U+004F).

https://apps.timwhitlock.info/unicode/inspect?s=%CE%9F
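If the data itself cannot be retyped, one possible workaround is to map the Greek letter to the Latin one before comparing. This is only a sketch: the names `GREEK_TO_LATIN` and `to_latin_o` are hypothetical, and it assumes omicron is the only look-alike character in the data.

```python
# Translation table: Greek capital omicron (U+039F) -> Latin capital O (U+004F)
GREEK_TO_LATIN = str.maketrans({"\u039f": "O"})


def to_latin_o(s):
    """Return s with every Greek capital omicron replaced by a Latin capital O."""
    return s.translate(GREEK_TO_LATIN)


sample = "*\u039f\u039f/"          # looks like '*OO/' but uses Greek letters
print(to_latin_o(sample) == "*OO/")  # True
```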

In addition, instead of comparing by index, it makes sense to use zip in this use case:

for actual, predicted in zip(actual_str, predicted_str):
    if ...:  # same comparisons as before, on actual and predicted
        ...
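A fuller sketch of the zip-based loop might look like the following. Several things here are assumptions rather than part of the question: the function name `confusion_counts`, the choice to treat one character as the positive class and everything else as negative, the usual convention that FP means "predicted positive, actually negative" and FN the reverse (the question's code labels these the other way round), and skipping positions where the prediction is '-'. The demo strings are short made-up examples, not the original data.

```python
def confusion_counts(actual, predicted, positive, skip="-"):
    """Count (TP, FP, TN, FN) for one positive class, position by position."""
    tp = fp = tn = fn = 0
    # zip stops at the shorter string, so unequal lengths are silently truncated
    for a, p in zip(actual, predicted):
        if p == skip:
            continue  # no prediction at this position, so count nothing
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p == positive:
            fp += 1
        elif a == positive and p != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


OMICRON = "\u039f"  # written as an escape so the character is unambiguous
actual_demo = "*\u039f\u039f\u039f//"
predicted_demo = "*\u039f-/\u039f/"

print(confusion_counts(actual_demo, predicted_demo, OMICRON))  # (1, 1, 2, 1)
```

Using `elif` makes the four cases mutually exclusive, so every counted position lands in exactly one bucket.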

2 answers

solution1
1 ACCEPTED 2021-05-05 22:08:45

solution2
0 2021-05-05 22:26:49
