How do I count matching and differing values between two DataFrame columns for a confusion matrix?

Question

Here is a sample of my data:

import pandas as pd
data = {'tweet': ['saya suka makanan ini sangat enak', 'rasa kuahnya kurang enak, terlalu asin', 'favorit saya nih, ayam gorengnya enak banget', 'nasi bakar di toko ini enak banget!'],
        'actual_class': ["Positive", "Negative", "Positive", "Positive"], 'predicted_class': ["Positive", "Positive", "Negative", "Positive"]} 
df = pd.DataFrame(data)

I want to count the True Positives, False Positives, True Negatives, and False Negatives between the actual_class and predicted_class columns of my dataframe without using scikit-learn. I tried to code it myself, but I can't find an efficient way.

Answer 1

You can use the value_counts method from pandas:

df['required column'].value_counts()
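
To get confusion-matrix counts rather than per-column counts, one option is to count (actual, predicted) pairs instead of a single column. The following is a minimal sketch, assuming pandas 1.1 or later (for DataFrame.value_counts) and assuming "Positive" is treated as the positive class; the names pair_counts, tp, fp, fn and tn are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "actual_class": ["Positive", "Negative", "Positive", "Positive"],
    "predicted_class": ["Positive", "Positive", "Negative", "Positive"],
})

# Count each (actual, predicted) pair; DataFrame.value_counts needs pandas >= 1.1.
pair_counts = df[["actual_class", "predicted_class"]].value_counts().to_dict()

# Assumption: "Positive" is the positive class.
# Keys are (actual, predicted); pairs that never occur default to 0.
tp = pair_counts.get(("Positive", "Positive"), 0)
fp = pair_counts.get(("Negative", "Positive"), 0)
fn = pair_counts.get(("Positive", "Negative"), 0)
tn = pair_counts.get(("Negative", "Negative"), 0)

print(tp, fp, fn, tn)
```

Using .get with a default of 0 matters here because a pair that never appears in the data (True Negative in this sample) is simply absent from the counts.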

Answer 2

If you cannot use scikit-learn, but can use pandas, you might like pandas.crosstab:

import pandas as pd
data = {'actual_class': ["Positive", "Negative", "Positive", "Positive"], 'predicted_class': ["Positive", "Positive", "Negative", "Positive"]}
df = pd.DataFrame(data)

print(pd.crosstab(df.actual_class, df.predicted_class))

That is, you get the same counts you would with from sklearn.metrics import confusion_matrix; print(confusion_matrix(df.actual_class, df.predicted_class)), with rows for the actual class and columns for the predicted class:

predicted_class  Negative  Positive
actual_class                       
Negative                0         1
Positive                1         2
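
If you need the four numbers as separate variables, you can read them out of the crosstab by label. This is a rough sketch, assuming "Positive" is the positive class; the reindex step is my own addition so that a label missing from either column still shows up as a row or column of zeros.

```python
import pandas as pd

df = pd.DataFrame({
    "actual_class": ["Positive", "Negative", "Positive", "Positive"],
    "predicted_class": ["Positive", "Positive", "Negative", "Positive"],
})

labels = ["Negative", "Positive"]

# Rows are actual classes, columns are predicted classes.
# reindex guarantees a full 2x2 table even if a label never occurs.
cm = pd.crosstab(df["actual_class"], df["predicted_class"]).reindex(
    index=labels, columns=labels, fill_value=0
)

# Assumption: "Positive" is the positive class.
tn = int(cm.loc["Negative", "Negative"])
fp = int(cm.loc["Negative", "Positive"])
fn = int(cm.loc["Positive", "Negative"])
tp = int(cm.loc["Positive", "Positive"])

print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
```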
