
Keras sparse_categorical_crossentropy loss function: output shape doesn't match

I have a dataset with 3570 labels. When I use sparse_categorical_crossentropy as the loss function, the output shape doesn't match.

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1024, input_dim=79, activation='relu'))
model.add(Dense(2048, activation='relu'))
model.add(Dense(3570, activation='sigmoid'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs=10,
          batch_size=1,
          validation_data=(x_valid, y_valid))

The output is: ValueError: Error when checking model target: expected dense_42 to have shape (None, 1) but got array with shape (1055, 3570)

Then I found issue #2444 and used np.expand_dims(y, -1) to change the code, but there was still an error.

import numpy as np

model = Sequential()
model.add(Dense(1024, input_dim=79, activation='relu'))
model.add(Dense(2048, activation='relu'))
model.add(Dense(3570, activation='sigmoid'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x_train, np.expand_dims(y_train, -1),
          epochs=10,
          batch_size=1,
          validation_data=(x_valid, np.expand_dims(y_valid, -1)))

The error is: ValueError: Error when checking model target: expected dense_45 to have 2 dimensions, but got array with shape (1055, 3570, 1)

How should I change the code?

What are the original y_train dimensions?

Most likely your y_train has the shape (1055,). In that case you need to one-hot encode y_train into shape (1055, 3570) and use loss='categorical_crossentropy', which expects one-hot targets rather than a single column of class indices.

You can use the following:

from keras.utils import to_categorical

y_cat = to_categorical(y, num_classes=None)
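To make the shapes concrete, here is a rough NumPy-only sketch of what one-hot encoding does. The helper name one_hot and the toy labels are made up for illustration and are not part of Keras. Targets of this shape go with loss='categorical_crossentropy'. Note that the first error message in the question already reports a target of shape (1055, 3570), so y_train there appears to be one-hot encoded already.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row.

    Hypothetical helper for illustration; it mimics the shape change
    that to_categorical performs on integer class ids.
    """
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Set the column given by each label to 1 in the matching row.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

y = np.array([0, 2, 1, 2])          # toy integer class ids, shape (4,)
y_cat = one_hot(y, num_classes=3)   # one-hot targets, shape (4, 3)
print(y.shape, y_cat.shape)
```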

loss='sparse_categorical_crossentropy' is meant for integer targets, not one-hot encodings. You probably need a "Dense(..." as the output layer and should use y_train directly.
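A minimal sketch of that idea, under two assumptions that are not stated in the question: each row of the (1055, 3570) y_train contains exactly one 1 (single-label classification), and a softmax output is an acceptable replacement for the sigmoid. The names to_sparse_labels and build_model are hypothetical helpers, and the small array stands in for the real y_train.

```python
import numpy as np

def to_sparse_labels(y_one_hot):
    """Collapse one-hot rows of shape (n, num_classes) to class ids of shape (n,)."""
    return np.argmax(y_one_hot, axis=-1)

def build_model(num_features=79, num_classes=3570):
    """Same hidden layers as the question; softmax output is an assumed fix."""
    # Imported here so the label conversion above can be tried without Keras.
    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(1024, input_dim=num_features, activation='relu'))
    model.add(Dense(2048, activation='relu'))
    # One probability per class; the sparse loss compares it to an integer id.
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])
    return model

# Toy stand-in for the one-hot y_train from the question.
y_one_hot = np.array([[0, 1, 0],
                      [1, 0, 0],
                      [0, 0, 1]])
y_sparse = to_sparse_labels(y_one_hot)
print(y_sparse.shape)   # (3,)
```

With the real data this would be used roughly as build_model().fit(x_train, to_sparse_labels(y_train), ...), converting y_valid the same way. If rows of y_train can contain more than one 1, the problem is multi-label and this conversion does not apply.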
