Computing a performance metric manually

Question

I have an assignment where I have to use the y and y_score columns of a .csv file, where y is the actual label and y_score is the predicted score (the column is named prob in my file). I have to take the unique y_score values, sort them in ascending order, use each one as a threshold, and compare it with y_score to get y_predicted: y_predicted = 0 if y_score < threshold else 1. From this I need the number of false positives and false negatives to calculate A = 500 * (number of false negatives) + 100 * (number of false positives). Finally, I need to find the lowest value of the metric A.

This is what I've written:

...

import pandas as pd

data2 = pd.read_csv('5_c.csv')
uniq2= data2['prob'].unique()
uniq2.sort()
aa=[]
for k,v in enumerate(uniq2):
    data2['thr'] = v
    for j,l in enumerate(data2['prob']):
      if l>v:
        data2['pred'] = 0
    else:
        data2['pred']=1
    df = data2[(data2['y']==0)&(data2['pred']==1)]
    FP = df.shape
    df = data2[(data2['y']==1)&(data2['pred']==1)]
    FN = df.shape
    A=100*FN+500*FP
    aa.append(A)
m=np.argmin(aa)
print(m)

...

This is a sample of my csv file

y   prob
0   0.458521
0   0.505037
0   0.418652
0   0.412057
0   0.375579
0   0.595387
0   0.370288

Answer 1

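import pandas as pd
import numpy as np

# Sample data; in practice, load it with pd.read_csv('5_c.csv')
df = pd.DataFrame({'y':[0,0,0,0,1,1,0],'prob':[0.458521,0.505037,0.418652,0.412057,0.375579,0.595387,0.370288]})

# Each unique score, in ascending order, is a candidate threshold
un = np.sort(df['prob'].unique())
aa = []
for i in un:
    pred = (df['prob'] >= i).astype(int)  # 0 if prob < threshold else 1
    fp = int(((df['y'] == 0) & (pred == 1)).sum())  # false positives
    fn = int(((df['y'] == 1) & (pred == 0)).sum())  # false negatives
    aa.append(500*fn + 100*fp)
best = int(np.argmin(aa))
print(un[best], aa[best])  # best threshold and the lowest value of A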
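
The same search can also be written without an explicit Python loop. The following is a minimal sketch, not the only way to do it: the function name `min_cost_threshold` and its `fn_cost`/`fp_cost` parameters are made up for illustration, and it assumes the rule from the question (predict 1 when the score is greater than or equal to the threshold). It compares every score against every threshold at once using NumPy broadcasting.

```python
import numpy as np

def min_cost_threshold(y, score, fn_cost=500, fp_cost=100):
    """Hypothetical helper: return (best_threshold, lowest_A).

    Every unique score is tried as a threshold, with the rule
    y_predicted = 0 if score < threshold else 1.
    """
    y = np.asarray(y)
    score = np.asarray(score)
    thresholds = np.unique(score)  # unique values, sorted ascending
    # pred[i, j] is True when sample j is predicted positive at threshold i
    pred = score[None, :] >= thresholds[:, None]
    fp = (pred & (y == 0)).sum(axis=1)   # false positives per threshold
    fn = (~pred & (y == 1)).sum(axis=1)  # false negatives per threshold
    cost = fn_cost * fn + fp_cost * fp   # metric A per threshold
    best = int(np.argmin(cost))          # first threshold with the lowest A
    return float(thresholds[best]), int(cost[best])

y = [0, 0, 0, 0, 1, 1, 0]
prob = [0.458521, 0.505037, 0.418652, 0.412057, 0.375579, 0.595387, 0.370288]
best_threshold, lowest_a = min_cost_threshold(y, prob)
print(best_threshold, lowest_a)  # 0.375579 400
```

DataFrame columns such as `df['y']` and `df['prob']` can be passed in directly, since `np.asarray` accepts them. The boolean matrix has one row per threshold and one column per sample, so memory grows with the product of the two; for very large files the loop version may be the safer choice.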

solution1
0 2020-02-26 07:12:34
