
ValueError: too many values to unpack (expected 4) in a decision tree algorithm

I copied code used for data visualization from Kaggle and applied it to another dataset. When I run it to produce the confusion matrix, the visualization, and so on, it raises ValueError: too many values to unpack (expected 4). I've searched many websites and videos for this error, but they only explain it for simple Python problems, not for a visualization. I don't know what needs to be added to or removed from this code to fix the error.

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
import collections

from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score, roc_curve, accuracy_score
conf_matrix_all = {}
a = []
def prediction(name, algo, training_x, testing_x, training_y, testing_y, plot) :
    global a
    algo.fit(training_x,training_y)                           # Fit the training data set to the algorithm passed.
    predictions = algo.predict(testing_x)                     # Get all predictions
    probabilities = algo.predict_proba(testing_x)             # Get probabilities of predictions

    conf_matrix = confusion_matrix(testing_y, predictions)    # Get confusion matrix using the predictions
    tn, fp, fn, tp = conf_matrix.ravel()
    
    conf_matrix_all[name] = conf_matrix                       # Save confusion matrix values to a dictionary
    a = conf_matrix    
    
    print("Classification report:")                           # Print the classification report
    print(classification_report(testing_y, predictions))
  
    model_roc_auc = roc_auc_score(testing_y, predictions)           # Get the Area under the curve number
    fpr,tpr,thresholds = roc_curve(testing_y, probabilities[:,1])   # Get false positive rate and true positive rate

    print ("Area under the curve: ", model_roc_auc)
    print(accuracy_score(testing_y, predictions))
    
    if plot:
        fig, axes = plt.subplots(1,2, figsize=(25, 5))
        conf_matrix = np.flip(conf_matrix)
        
        conf_2 = conf_matrix.astype(str)
        labels = np.array([['\nTP','\nFN'],['\nFP','\nTN']])
        labels = np.core.defchararray.add(conf_2, labels)
        sns.heatmap(conf_matrix, fmt='', annot = labels, ax=axes[0], cmap="YlGnBu", xticklabels=[1, 0], yticklabels=[1, 0]);                                           # Plot the confusion matrix
        axes[0].set(xlabel='Predicted', ylabel='Actual')

        plt.title('Receiver Operating Characteristic')
        sns.lineplot(x=fpr, y=tpr, ax=axes[1])                                     # Plot the ROC curve
        plt.plot([0, 1], [0, 1],'--')                                              # Plot the diagonal line
        axes[1].set_xlim([0, 1])                                                   # Set x-axis limit to 0 and 1
        axes[1].set_ylim([0, 1])                                                   # Set y-axis limit to 0 and 1
        axes[1].set(xlabel = 'False Positive Rate', ylabel = 'True Positive Rate');
        plt.show();



dtc = DecisionTreeClassifier(criterion='gini', splitter='best', max_depth=10, min_samples_split=2, 
                             min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, 
                             random_state=None, max_leaf_nodes=None, min_impurity_decrease=0,class_weight=None, ccp_alpha=0.0)

prediction("Decision Tree", dtc, train_X, test_X, train_y, test_y, plot=True)



ValueError                                Traceback (most recent call last)
<ipython-input-75-79b3eb994e92> in <module>
      3                              random_state=None, max_leaf_nodes=None, min_impurity_decrease=0,class_weight=None, ccp_alpha=0.0)
      4 
----> 5 prediction("Decision Tree", dtc, train_X, test_X, train_y, test_y, plot=True)

<ipython-input-74-590eb3298a78> in prediction(name, algo, training_x, testing_x, training_y, testing_y, plot)
     14 
     15     conf_matrix = confusion_matrix(testing_y, predictions)    # Get confusion matrix using the predictions
---> 16     tn, fp, fn, tp = conf_matrix.ravel()
     17 
     18     conf_matrix_all[name] = conf_matrix                       # Save confusion matrix values to a dictionary

ValueError: too many values to unpack (expected 4)

I've tried adding .items() or .itervalues(), as suggested in the videos and websites, but I can't figure out where they should be attached.

I want output like this, with the detailed classification report:

Classification report:

              precision    recall  f1-score   support

           0       0.91      0.71      0.79        41
           1       0.76      0.93      0.84        41

    accuracy                           0.82        82
   macro avg       0.83      0.82      0.81        82
weighted avg       0.83      0.82      0.81        82

Area under the curve:  0.8170731707317073


I see you tagged the question multiclass-classification. For a multiclass problem, scikit-learn's confusion_matrix() returns an n_classes x n_classes matrix, so conf_matrix.ravel() yields n_classes squared values, not 4.

You can't assign 9 or 16 (or however many values you have here) to tn, fp, fn, tp. That terminology applies to binary classification.

You can either calculate the metrics you want from the raw multiclass confusion matrix, or use some of the other functions in sklearn.metrics.
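As a rough sketch of the first option, the one-vs-rest counts for each class can be read off the multiclass matrix with NumPy. The 3-class matrix and the helper name per_class_counts below are made up for illustration; the only thing assumed about your data is that the matrix has the layout confusion_matrix() returns (rows are actual classes, columns are predicted classes).

```python
import numpy as np

# Hypothetical 3-class confusion matrix in scikit-learn's layout:
# rows are actual classes, columns are predicted classes.
conf_matrix = np.array([
    [50,  2,  3],
    [ 4, 40,  6],
    [ 1,  5, 44],
])

def per_class_counts(cm):
    """Return one-vs-rest (tp, fp, fn, tn) arrays, one entry per class."""
    cm = np.asarray(cm)
    tp = np.diag(cm)                 # actual class i, predicted class i
    fp = cm.sum(axis=0) - tp         # predicted class i, actually another class
    fn = cm.sum(axis=1) - tp         # actual class i, predicted as another class
    tn = cm.sum() - (tp + fp + fn)   # samples not involving class i at all
    return tp, fp, fn, tn

tp, fp, fn, tn = per_class_counts(conf_matrix)
for i in range(conf_matrix.shape[0]):
    print(f"class {i}: TP={tp[i]} FP={fp[i]} FN={fn[i]} TN={tn[i]}")
```

For the second option, sklearn.metrics.multilabel_confusion_matrix(y_true, y_pred) returns one 2x2 matrix per class, each laid out as [[tn, fp], [fn, tp]], so the same counts are available without the manual arithmetic.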

Check the dimensions of the confusion matrix (conf_matrix.shape) to make sure it is 2x2 before unpacking it, and check that the predicted and actual label arrays have the same length.
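One way to make that check explicit is a small guard around the unpacking step, so a non-binary matrix fails with a clearer message. This is only a sketch: the helper name unpack_binary and the example matrices are hypothetical, not part of the original code.

```python
import numpy as np

def unpack_binary(conf_matrix):
    """Return (tn, fp, fn, tp) for a 2x2 matrix, or raise a descriptive error."""
    conf_matrix = np.asarray(conf_matrix)
    if conf_matrix.shape != (2, 2):
        raise ValueError(
            f"expected a 2x2 confusion matrix, got shape {conf_matrix.shape}; "
            "the target probably has more than two classes"
        )
    # ravel() flattens row by row: [[tn, fp], [fn, tp]] -> tn, fp, fn, tp
    tn, fp, fn, tp = conf_matrix.ravel()
    return int(tn), int(fp), int(fn), int(tp)

# Hypothetical binary matrix: unpacking works.
print(unpack_binary([[29, 12], [3, 38]]))

# Hypothetical 3-class matrix: the guard reports the real problem.
try:
    unpack_binary(np.zeros((3, 3), dtype=int))
except ValueError as err:
    print(err)
```

In the prediction() function from the question, the same shape test could be placed just before the tn, fp, fn, tp line.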
