
I want to count how many values differ between two columns of a DataFrame, for a confusion matrix

Here is a sample of my data:

import pandas as pd
data = {'tweet': ['saya suka makanan ini sangat enak', 'rasa kuahnya kurang enak, terlalu asin', 'favorit saya nih, ayam gorengnya enak banget', 'nasi bakar di toko ini enak banget!'],
        'actual_class': ["Positive", "Negative", "Positive", "Positive"], 'predicted_class': ["Positive", "Positive", "Negative", "Positive"]} 
df = pd.DataFrame(data)

I want to count the True Positives, False Positives, True Negatives, and False Negatives between the actual_class and predicted_class columns of my DataFrame without using scikit-learn. I tried to code it myself, but I can't find an efficient way.

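You can use the value_counts method from pandas, replacing 'required column' with the name of the column you want to tally. Note that on a single column this counts how often each class label occurs, not the agreement between the two columns: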

df['required column'].value_counts()
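To get counts that relate the two columns, one option is to call value_counts on both columns together. This is a minimal sketch, assuming pandas 1.1 or later (where DataFrame.value_counts is available); the names pair_counts and counts are illustrative and not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'actual_class': ["Positive", "Negative", "Positive", "Positive"],
    'predicted_class': ["Positive", "Positive", "Negative", "Positive"],
})

# Count each (actual, predicted) combination.
# Combinations that never occur are simply absent from the result.
pair_counts = df[['actual_class', 'predicted_class']].value_counts()

# Convert to a plain dict keyed by (actual, predicted) tuples,
# so that missing combinations can default to 0.
counts = pair_counts.to_dict()
print(counts.get(('Positive', 'Positive'), 0))  # actual Positive, predicted Positive
print(counts.get(('Negative', 'Negative'), 0))  # never occurs in the sample data
```

Looking up through a dict with a default avoids a KeyError for a combination, such as (Negative, Negative) here, that does not appear in the data.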

If you cannot use scikit-learn but can use pandas, you might like pandas.crosstab:

import pandas as pd
data = {'actual_class': ["Positive", "Negative", "Positive", "Positive"], 'predicted_class': ["Positive", "Positive", "Negative", "Positive"]}
df = pd.DataFrame(data)

print(pd.crosstab(df.actual_class, df.predicted_class))

That is, you get the same counts you would get from scikit-learn with from sklearn.metrics import confusion_matrix; print(confusion_matrix(df.actual_class, df.predicted_class)), except that crosstab labels the rows and columns:

predicted_class  Negative  Positive
actual_class                       
Negative                0         1
Positive                1         2
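If the four individual counts are needed as numbers, they can also be computed directly with boolean masks. The following is a sketch that assumes "Positive" is the positive class; the variable names (actual_pos, pred_pos, tp, fp, fn, tn) are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'actual_class': ["Positive", "Negative", "Positive", "Positive"],
    'predicted_class': ["Positive", "Positive", "Negative", "Positive"],
})

# Boolean masks: True where the label is the positive class.
# Assumption: "Positive" is treated as the positive class.
actual_pos = df['actual_class'] == 'Positive'
pred_pos = df['predicted_class'] == 'Positive'

tp = int((actual_pos & pred_pos).sum())    # actual Positive, predicted Positive
fp = int((~actual_pos & pred_pos).sum())   # actual Negative, predicted Positive
fn = int((actual_pos & ~pred_pos).sum())   # actual Positive, predicted Negative
tn = int((~actual_pos & ~pred_pos).sum())  # actual Negative, predicted Negative

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
```

For the sample data these agree with the crosstab table above: the Positive/Positive cell is the true positives, Negative/Positive the false positives, Positive/Negative the false negatives, and Negative/Negative the true negatives.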

