
Computing a performance metric manually

I have an assignment where I have to use the y and y_score columns of a .csv file, where y is the actual value and y_score is the predicted score. I have to take the unique y_score values, sort them in ascending order, use each one as a threshold, and compare every y_score against it to get y_predicted: y_predicted = [0 if y_score < threshold else 1]. From this I need to count the false positives and false negatives and compute A = 500 * number of false negatives + 100 * number of false positives. Finally, I have to find the lowest value of the metric A.

This is what I've written:

...

import pandas as pd
data2 = pd.read_csv('5_c.csv')
uniq2= data2['prob'].unique()
uniq2.sort()
aa=[]
for k,v in enumerate(uniq2):
    data2['thr'] = v
    for j,l in enumerate(data2['prob']):
      if l>v:
        data2['pred'] = 0
    else:
        data2['pred']=1
    df = data2[(data2['y']==0)&(data2['pred']==1)]
    FP = df.shape
    df = data2[(data2['y']==1)&(data2['pred']==1)]
    FN = df.shape
    A=100*FN+500*FP
    aa.append(A)
m=np.argmin(aa)
print(m)

...

This is a sample of my CSV file:

y   prob
0   0.458521
0   0.505037
0   0.418652
0   0.412057
0   0.375579
0   0.595387
0   0.370288
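
import pandas as pd
import numpy as np

# Sample data
df = pd.DataFrame({'y':[0,0,0,0,1,1,0],'prob':[0.458521,0.505037,0.418652,0.412057,0.375579,0.595387,0.370288]})

# Unique scores in ascending order; each one is tried as a threshold
un = np.sort(df['prob'].unique())
aa = []
for i in un:
    pred = (df['prob'] >= i).astype(int)       # 0 if prob < threshold else 1
    fp = ((df['y'] == 0) & (pred == 1)).sum()  # false positives
    fn = ((df['y'] == 1) & (pred == 0)).sum()  # false negatives
    A = 500*fn + 100*fp
    aa.append(A)
best = np.argmin(aa)
print(un[best], aa[best])  # best threshold and the lowest value of A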
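
As a possible alternative to the explicit loop, here is a minimal sketch that evaluates every threshold at once with NumPy broadcasting. The function name lowest_cost and its parameters fn_cost and fp_cost are hypothetical names chosen for illustration. The sketch assumes the rule from the question (predict 0 if the score is below the threshold, otherwise 1) and reuses the sample frame from the code above.

```python
import numpy as np
import pandas as pd


def lowest_cost(y, prob, fn_cost=500, fp_cost=100):
    """Return (best_threshold, lowest_A) over all unique scores.

    Hypothetical helper: assumes y holds 0/1 labels and prob holds scores.
    """
    y = np.asarray(y)
    prob = np.asarray(prob)
    thresholds = np.unique(prob)  # unique values, sorted ascending
    # pred[t, n] is True when sample n scores at or above threshold t
    pred = prob[None, :] >= thresholds[:, None]
    fp = (pred & (y == 0)).sum(axis=1)   # predicted 1, actual 0
    fn = (~pred & (y == 1)).sum(axis=1)  # predicted 0, actual 1
    cost = fn_cost * fn + fp_cost * fp
    best = np.argmin(cost)
    return thresholds[best], int(cost[best])


df = pd.DataFrame({'y': [0, 0, 0, 0, 1, 1, 0],
                   'prob': [0.458521, 0.505037, 0.418652, 0.412057,
                            0.375579, 0.595387, 0.370288]})
threshold, A = lowest_cost(df['y'], df['prob'])
print(threshold, A)
```

The boolean matrix has one row per threshold and one column per sample, so memory grows with the product of the two. For a large file the loop may be the safer choice.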

