Keras confusion matrix: ValueError: Classification metrics can't handle a mix of multiclass-multioutput and binary targets

I am pretty sure I am doing something simple incorrectly. I recently added a confusion matrix to my code, and it raises the error "ValueError: Classification metrics can't handle a mix of multiclass-multioutput and binary targets". The y training values are the encoded ground-truth labels. Should my targets be binary outputs, or should they also be one-hot encoded?

Am I maybe using the wrong loss? Any thoughts?


def get_model(word_length):
    dim1 = 28
    dim2 = 28
    input_signal = Input(shape=(dim1, dim2, 2))
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_signal)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Flatten()(x)
    x = Dense(128, activation='relu')(x)
    x = Dense(word_length, activation='softmax')(x)

    model = Model(inputs=input_signal, outputs=x)
    model.summary()
    return model


if __name__ == "__main__":

    mlflow.keras.autolog()

trainPath = "Data/"
totalsamples = 0

train = get_data(trainPath)
X = train.path
labelbinarizer = LabelBinarizer()
y = labelbinarizer.fit_transform(train.word)

X, Xt, y, yt = train_test_split(X, y, test_size=0.3, stratify=y)
batchsize = 10

train_gen = batch_generator(X, y, batchsize)
valid_gen = batch_generator(Xt, yt, batchsize)

tensorboard = TensorBoard(log_dir='./logs/{}'.format(time.time()),
                          batch_size=32)

word_length = len(train.word.unique())
model = get_model(word_length)
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(), metrics=['accuracy'])


step_train_gen = X.shape[0] // batchsize
step_valid_gen = Xt.shape[0] // batchsize

steps = step_train_gen
valid_steps = step_valid_gen

history = model.fit_generator(
            generator=train_gen,
            epochs=7,
            steps_per_epoch=steps,
            validation_data=valid_gen,
            validation_steps=valid_steps,
            callbacks=[tensorboard])

history = history.history
print('Validation accuracy: {acc}, loss: {loss}'.format(
        acc=history['val_acc'][-1], loss=history['val_loss'][-1]))


predictions = model.predict_generator(valid_gen, verbose=0, steps=valid_steps)

y_pred = np.argmax(predictions, axis=1)

print(confusion_matrix(labels, y_pred))

Since I cannot see the full code, I am assuming the following:

  1. confusion_matrix is imported from sklearn.metrics
  2. labels = yt

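Now, confusion_matrix does not accept one-hot encoded inputs; it expects one class label per sample. In your code y_pred is already reduced to class indices by np.argmax, but yt is still one-hot encoded by LabelBinarizer. Assuming yt has one column per class (more than two classes), converting it back to class indices should resolve your issue:

y_true_labels = np.argmax(yt, axis=1)
confusion_matrix(y_true_labels, y_pred)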
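As a rough, self-contained sketch of that conversion: the word list and score matrix below are made-up stand-ins for train.word and for the output of model.predict_generator, not the asker's data. It shows two ways to undo the LabelBinarizer encoding before calling confusion_matrix, either with argmax or with inverse_transform.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelBinarizer

# Hypothetical labels standing in for train.word.
words = ["cat", "dog", "bird", "dog", "cat", "bird"]

binarizer = LabelBinarizer()
y_onehot = binarizer.fit_transform(words)  # shape (6, 3); columns follow binarizer.classes_

# Hypothetical softmax-like scores standing in for the model's predictions.
scores = np.array([
    [0.1, 0.7, 0.2],
    [0.2, 0.1, 0.7],
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.8, 0.1],
    [0.3, 0.1, 0.6],
])

# Option 1: reduce both sides to integer class indices.
y_true_idx = y_onehot.argmax(axis=1)
y_pred_idx = scores.argmax(axis=1)
cm = confusion_matrix(y_true_idx, y_pred_idx)

# Option 2: map both sides back to the original string labels.
true_words = binarizer.inverse_transform(y_onehot)
pred_words = binarizer.classes_[y_pred_idx]
cm_words = confusion_matrix(true_words, pred_words, labels=binarizer.classes_)

print(binarizer.classes_)
print(cm)
```

Both options give the same matrix; the second keeps the row and column order tied to binarizer.classes_, which can make the result easier to read. With a generator, it is also worth checking that the number of predictions matches the number of true labels before comparing them.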

The following runs show which input formats confusion_matrix supports:

# 1. Test run with labels
In [1]: from sklearn.metrics import confusion_matrix
In [2]: y_true = [2, 0, 2, 2, 0, 1]
In [3]: y_pred = [0, 0, 2, 2, 0, 2]
In [4]: confusion_matrix(y_true, y_pred)
Out[4]: 
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])

# 2. Test run with one-hot encoded vectors
In [5]: y_true = [[1,0],[0,0], [1,0], [1,0], [0, 0], [0, 1]]
In [6]: y_pred = [[0, 0], [0, 0], [1, 0], [1, 0], [0, 0], [1, 0]]
In [7]: confusion_matrix(y_true, y_pred)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-1a5d7236d395> in <module>
----> 1 confusion_matrix(y_true, y_pred)
ValueError: multiclass-multioutput is not supported

# 3. Test run with categorical input
In [8]: y_pred = ["zero", "zero", "two", "two", "zero", "two"]
In [9]: y_true = ["two", "zero", "two", "two", "zero", "one"]
In [10]: confusion_matrix(y_true, y_pred, labels=["zero", "one", "two"])
Out[10]: 
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
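The wording of the error ("a mix of multiclass-multioutput and binary targets") comes from the target type scikit-learn infers for each argument. One way to inspect this is sklearn.utils.multiclass.type_of_target; the sketch below runs it on a few made-up arrays (none of them taken from the question) to show how the shape and values of an array change the inferred type.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Made-up example arrays; each maps to the target type scikit-learn infers for it.
kinds = {
    "class indices": type_of_target(np.array([2, 0, 2, 2, 0, 1])),
    "two-class indices": type_of_target(np.array([0, 1, 1, 0])),
    "one-hot rows": type_of_target(np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])),
    "2-D integer labels": type_of_target(np.array([[1, 2], [3, 1]])),
    "string labels": type_of_target(np.array(["zero", "two", "one"])),
}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

Running the same check on the actual labels and y_pred arrays should show which of the two arguments is still two-dimensional and needs to be reduced to one label per sample.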

