How does data shape change during Conv2D and Dense in Keras?

Question

Just as the title says. This code only works if I add:

x = Flatten()(x)

between the convolutional layer and the dense layer.

import numpy as np
import keras
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Flatten, Input
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD

# Generate dummy data
x_train = np.random.random((100, 100, 100, 3))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)

#Build Model
input_layer = Input(shape=(100, 100, 3))
x = Conv2D(32, (3, 3), activation='relu')(input_layer)
x = Dense(256, activation='relu')(x)
x = Dense(10, activation='softmax')(x)
model = Model(inputs=[input_layer],outputs=[x])

#compile network
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

#train network
model.fit(x_train, y_train, batch_size=32, epochs=10)

Otherwise, I receive this error:

Traceback (most recent call last):

File "/home/michael/practice_example.py", line 44, in <module>
    model.fit(x_train, y_train, batch_size=32, epochs=10)

File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1435, in fit
    batch_size=batch_size)

File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1315, in _standardize_user_data
    exception_prefix='target')

File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 127, in _standardize_input_data
    str(array.shape))

ValueError: Error when checking target: expected dense_2 to have 4 dimensions, but got array with shape (100, 10)

Why would the output have 4 dimensions without the `Flatten()` layer?

Answer 1

According to the Keras documentation:

Conv2D Output shape

4D tensor with shape: (samples, filters, new_rows, new_cols) if data_format='channels_first' or 4D tensor with shape: (samples, new_rows, new_cols, filters) if data_format='channels_last'. rows and cols values might have changed due to padding.

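Since you are using channels_last, the shape of each layer's output would be as follows. Here row and col are both 98, because a 3x3 kernel with the default padding='valid' trims one pixel from each border of the 100x100 input: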

# shape=(100, 100, 100, 3)

x = Conv2D(32, (3, 3), activation='relu')(input_layer)
# shape=(100, row, col, 32)

x = Flatten()(x)
# shape=(100, row*col*32)    

x = Dense(256, activation='relu')(x)
# shape=(100, 256)

x = Dense(10, activation='softmax')(x)
# shape=(100, 10)
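To make the row and col placeholders concrete, here is a minimal sketch of the size arithmetic in plain Python. The helper names conv_output_length and conv2d_output_shape are made up for this illustration and are not part of the Keras API; the sketch assumes channels_last data and a dilation rate of 1.

```python
import math


def conv_output_length(input_length, kernel_size, padding="valid", stride=1):
    """Spatial size along one axis after a convolution (dilation rate 1 assumed)."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_length - kernel_size) // stride + 1
    if padding == "same":
        # Input is padded so that only the stride shrinks the output.
        return math.ceil(input_length / stride)
    raise ValueError("padding must be 'valid' or 'same'")


def conv2d_output_shape(input_shape, filters, kernel_size,
                        padding="valid", strides=(1, 1)):
    """Output shape for a channels_last input (samples, rows, cols, channels)."""
    samples, rows, cols, _ = input_shape
    new_rows = conv_output_length(rows, kernel_size[0], padding, strides[0])
    new_cols = conv_output_length(cols, kernel_size[1], padding, strides[1])
    return (samples, new_rows, new_cols, filters)


# The layer from the question: Conv2D(32, (3, 3)) on 100 images of 100x100x3.
print(conv2d_output_shape((100, 100, 100, 3), 32, (3, 3)))
# (100, 98, 98, 32)

# With padding='same' the spatial size would be preserved.
print(conv2d_output_shape((100, 100, 100, 3), 32, (3, 3), padding="same"))
# (100, 100, 100, 32)
```

So after Flatten the tensor in the question has 98*98*32 features per sample.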

Error explanation (edited, thanks to @Marcin)

Applying a Dense layer to a 4D tensor (shape=(100, row, col, 32)) does not produce a 2D tensor (shape=(100, 256)). Dense acts only on the last axis, so the result is still a 4D tensor (shape=(100, row, col, 256)), which is not what you want.

# shape=(100, 100, 100, 3)

x = Conv2D(32, (3, 3), activation='relu')(input_layer)
# shape=(100, row, col, 32)

x = Dense(256, activation='relu')(x)
# shape=(100, row, col, 256)

x = Dense(10, activation='softmax')(x)
# shape=(100, row, col, 10)

The error is raised when the 4D model output is checked against the 2D target of shape (100, 10).

That's why you need a Flatten layer: it flattens the tensor from 4D to 2D before the Dense layers.

Reference

Keras documentation: Conv2D, Dense

Answer 2

The Dense documentation states that when the input to a Dense layer has more than two dimensions, the layer is applied only to the last one, and all other dimensions are kept:

# shape=(100, 100, 100, 3)

x = Conv2D(32, (3, 3), activation='relu')(input_layer)
# shape=(100, row, col, 32)

x = Dense(256, activation='relu')(x)
# shape=(100, row, col, 256)

x = Dense(10, activation='softmax')(x)
# shape=(100, row, col, 10)

That's why a 4D target is expected.
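The last-axis behaviour can be reproduced with plain NumPy, without building a model. This is only a sketch of the shape rule, not Keras's actual implementation: biases and activations are left out, the kernels are random, and row = col = 4 is an illustrative size chosen to keep the arrays small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a Conv2D output: (samples, row, col, filters), channels_last.
conv_out = rng.random((2, 4, 4, 32))

# Dense-style kernel mapping 32 input features to 256 units.
kernel = rng.random((32, 256))

# Without flattening: the product contracts only the last axis,
# so both spatial axes survive in the result.
dense_4d = conv_out @ kernel
print(dense_4d.shape)  # (2, 4, 4, 256)

# With flattening: all non-batch axes are merged into one first.
flat = conv_out.reshape(conv_out.shape[0], -1)
print(flat.shape)  # (2, 512)

# The kernel now needs one row per flattened feature.
flat_kernel = rng.random((flat.shape[1], 256))
dense_2d = flat @ flat_kernel
print(dense_2d.shape)  # (2, 256)
```

In the unflattened case the same kernel is applied independently at every (row, col) position, which is why the output stays 4D and cannot match a target of shape (samples, 10).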


2 answers

Answer 1: 7 ACCEPTED 2017-07-07 14:22:54
Answer 2: 1 2017-07-07 14:27:20
