I copied code used for data visualization from Kaggle and applied it to another dataset. When I run it to produce the confusion matrix, plots, etc., it raises ValueError: too many values to unpack (expected 4). I've searched many websites and videos for this error, but they only explain it for simple Python problems, not for visualization code. I don't know which values need to be added or removed in this code to fix the error.
import collections

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score, roc_curve, accuracy_score

conf_matrix_all = {}
a = []

def prediction(name, algo, training_x, testing_x, training_y, testing_y, plot):
    global a
    algo.fit(training_x, training_y)  # Fit the training data set to the algorithm passed.
    predictions = algo.predict(testing_x)  # Get all predictions
    probabilities = algo.predict_proba(testing_x)  # Get probabilities of predictions

    conf_matrix = confusion_matrix(testing_y, predictions)  # Get confusion matrix using the predictions
    tn, fp, fn, tp = conf_matrix.ravel()

    conf_matrix_all[name] = conf_matrix  # Save confusion matrix values to a dictionary
    a = conf_matrix

    print("Classification report:")  # Print the classification report
    print(classification_report(testing_y, predictions))

    model_roc_auc = roc_auc_score(testing_y, predictions)  # Get the area under the curve
    fpr, tpr, thresholds = roc_curve(testing_y, probabilities[:, 1])  # Get false positive rate and true positive rate
    print("Area under the curve: ", model_roc_auc)
    print(accuracy_score(testing_y, predictions))

    if plot:
        fig, axes = plt.subplots(1, 2, figsize=(25, 5))
        conf_matrix = np.flip(conf_matrix)
        conf_2 = conf_matrix.astype(str)
        labels = np.array([['\nTP', '\nFN'], ['\nFP', '\nTN']])
        labels = np.char.add(conf_2, labels)
        sns.heatmap(conf_matrix, fmt='', annot=labels, ax=axes[0], cmap="YlGnBu", xticklabels=[1, 0], yticklabels=[1, 0])  # Plot the confusion matrix
        axes[0].set(xlabel='Predicted', ylabel='Actual')
        plt.title('Receiver Operating Characteristic')
        sns.lineplot(x=fpr, y=tpr, ax=axes[1])  # Plot the ROC curve
        plt.plot([0, 1], [0, 1], '--')  # Plot the diagonal line
        axes[1].set_xlim([0, 1])  # Set x-axis limits to 0 and 1
        axes[1].set_ylim([0, 1])  # Set y-axis limits to 0 and 1
        axes[1].set(xlabel='False Positive Rate', ylabel='True Positive Rate')
        plt.show()

dtc = DecisionTreeClassifier(criterion='gini', splitter='best', max_depth=10, min_samples_split=2,
                             min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None,
                             random_state=None, max_leaf_nodes=None, min_impurity_decrease=0, class_weight=None, ccp_alpha=0.0)

prediction("Decision Tree", dtc, train_X, test_X, train_y, test_y, plot=True)
ValueError                                Traceback (most recent call last)
<ipython-input-75-79b3eb994e92> in <module>
      3                              random_state=None, max_leaf_nodes=None, min_impurity_decrease=0,class_weight=None, ccp_alpha=0.0)
      4
----> 5 prediction("Decision Tree", dtc, train_X, test_X, train_y, test_y, plot=True)

<ipython-input-74-590eb3298a78> in prediction(name, algo, training_x, testing_x, training_y, testing_y, plot)
     14
     15     conf_matrix = confusion_matrix(testing_y, predictions) # Get confusion matrix using the predictions
---> 16     tn, fp, fn, tp = conf_matrix.ravel()
     17
     18     conf_matrix_all[name] = conf_matrix # Save confusion matrix values to a dictionary

ValueError: too many values to unpack (expected 4)
I tried adding .items() or .itervalues(), as suggested in the videos and websites, but I can't figure out where they should be attached.
I want output like this, with the detailed classification report:

Classification report:
              precision    recall  f1-score   support

           0       0.91      0.71      0.79        41
           1       0.76      0.93      0.84        41

    accuracy                           0.82        82
   macro avg       0.83      0.82      0.81        82
weighted avg       0.83      0.82      0.81        82

Area under the curve:  0.8170731707317073
I see you labeled the question with multiclass-classification. scikit-learn's confusion_matrix() returns an n_classes * n_classes matrix, so with more than two classes you can't assign the 9 or 16 or however many values you have here to tn, fp, fn, tp. That terminology is for binary classification.
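To make the cause concrete, here is a minimal sketch that reproduces the same error. The three-class labels are made up for illustration and are not the asker's data:

```python
from sklearn.metrics import confusion_matrix

# Toy labels with three classes (hypothetical, for illustration only).
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm.shape)          # 3 classes give a 3 x 3 matrix
print(cm.ravel().size)   # flattening it yields 9 values, not 4

try:
    tn, fp, fn, tp = cm.ravel()
except ValueError as err:
    print(err)           # the same "too many values to unpack" message
```

With two classes the matrix is 2 x 2, ravel() yields exactly four values, and the original line works.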
You can either calculate the metrics you want from the raw multiclass confusion matrix, or use some of the other methods in sklearn.metrics.
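Both options can be sketched roughly as follows. The helper name per_class_counts is hypothetical, the toy labels and the iris dataset stand in for the asker's data, and treating each class one-vs-rest is an assumption about what the desired metrics are:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import (confusion_matrix, multilabel_confusion_matrix,
                             roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def per_class_counts(y_true, y_pred):
    """Hypothetical helper: one-vs-rest tn, fp, fn, tp, one entry per class."""
    cm = confusion_matrix(y_true, y_pred)  # rows = actual, columns = predicted
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp               # predicted as the class, actually another
    fn = cm.sum(axis=1) - tp               # actually the class, predicted as another
    tn = cm.sum() - (tp + fp + fn)
    return tn, fp, fn, tp

# Option 1: derive the counts from the raw multiclass matrix.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]
tn, fp, fn, tp = per_class_counts(y_true, y_pred)

# Option 2: let sklearn build one 2 x 2 matrix per class,
# each laid out as [[tn, fp], [fn, tp]].
mcm = multilabel_confusion_matrix(y_true, y_pred)

# Multiclass ROC AUC needs the full probability matrix and a multi_class mode.
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
print("One-vs-rest AUC:", auc)
```

Note that classification_report() and accuracy_score() already handle multiclass labels, so only the unpacking line and the ROC/AUC part of the original function need this kind of change.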
Check the dimensions of the confusion matrix to ensure that it is a 2x2 matrix, and the dimensions of the predicted and actual labels to ensure that they match.
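One way to perform that check before unpacking is sketched below. The function name binary_counts_or_none is hypothetical, and returning None for the non-binary case is just one possible design:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def binary_counts_or_none(y_true, y_pred):
    """Return (tn, fp, fn, tp) for a 2 x 2 matrix, otherwise None."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Actual and predicted labels must line up one-to-one.
    if y_true.shape != y_pred.shape:
        raise ValueError(f"shape mismatch: {y_true.shape} vs {y_pred.shape}")
    cm = confusion_matrix(y_true, y_pred)
    if cm.shape != (2, 2):
        return None  # not a binary problem, so tn/fp/fn/tp do not apply
    tn, fp, fn, tp = cm.ravel()
    return int(tn), int(fp), int(fn), int(tp)

print(binary_counts_or_none([0, 1, 1, 0], [0, 1, 0, 0]))  # binary labels
print(binary_counts_or_none([0, 1, 2, 2], [0, 2, 2, 1]))  # three classes
```

Printing np.unique(test_y) on the real data is a quick way to see how many classes the matrix will have.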