
Keras - Negative dimension size caused by subtracting 5 from 4 for 'conv2d_5/convolution' (op: 'Conv2D') with input shapes: [?,4,80,64], [5,5,64,64]

I have a model similar to the one below, but after modifying the architecture I keep getting the following error:

Negative dimension size caused by subtracting 5 from 4 for 'conv2d_5/convolution' (op: 'Conv2D') with input shapes: [?,4,80,64], [5,5,64,64].

I am still new to machine learning, so I couldn't make much sense of the parameters. Any help?

    from keras.models import Sequential
    from keras.layers import Cropping2D, Lambda, Conv2D, Flatten, Dense, Dropout
    # NOTE: Merge only exists in the old Keras 1.x API; in Keras 2.x the
    # equivalent is keras.layers.Concatenate with the functional API.
    from keras.layers import Merge

    model_img = Sequential(name="img")
    # Cropping
    model_img.add(Cropping2D(cropping=((124,126),(0,0)), input_shape=(376,1344,3)))
    # Normalization
    model_img.add(Lambda(lambda x: (2*x / 255.0) - 1.0))
    model_img.add(Conv2D(16, (7, 7), activation="relu", strides=(2, 2)))
    model_img.add(Conv2D(32, (7, 7), activation="relu", strides=(2, 2)))
    model_img.add(Conv2D(32, (5, 5), activation="relu", strides=(2, 2)))
    model_img.add(Conv2D(64, (5, 5), activation="relu", strides=(2, 2)))
    model_img.add(Conv2D(64, (5, 5), activation="relu", strides=(2, 2)))
    model_img.add(Conv2D(128, (3, 3), activation="relu"))
    model_img.add(Conv2D(128, (3, 3), activation="relu"))
    model_img.add(Flatten())
    model_img.add(Dense(100))
    model_img.add(Dense(50))
    model_img.add(Dense(10))

    model_lidar = Sequential(name="lidar")
    model_lidar.add(Dense(32, input_shape=(360,)))
    model_lidar.add(Dropout(0.1))
    model_lidar.add(Dense(10))

    model_imu = Sequential(name='imu')
    model_imu.add(Dense(32, input_shape=(10, )))
    model_imu.add(Dropout(0.1))
    model_imu.add(Dense(10))

    merged = Merge([model_img, model_lidar, model_imu], mode="concat")
    model = Sequential()
    model.add(merged)
    model.add(Dense(16))
    model.add(Dropout(0.2))
    model.add(Dense(1))

Answer: I couldn't complete the training because of issues with the sensor, but the model works fine now thanks to the two answers below.

Here are the output shapes of each layer in your model:

(?, 376, 1344, 3) - Input
(?, 126, 1344, 3) - Cropping2D
(?, 126, 1344, 3) - Lambda
(?, 60, 669, 16)  - Conv2D 1
(?, 27, 332, 32)  - Conv2D 2
(?, 12, 164, 32)  - Conv2D 3
(?, 4, 80, 64)    - Conv2D 4

By the time the input has passed through the 4th Conv2D layer, the spatial output shape is already (4, 80). You cannot apply another Conv2D layer with a (5, 5) filter, since the first spatial dimension (4) is smaller than the filter size (5).
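
For illustration, here is a minimal snippet (a sketch, assuming the same Keras/TensorFlow setup as in the question) that reproduces the error with just that input shape and layer:

    from keras.models import Sequential
    from keras.layers import Conv2D

    m = Sequential()
    # A "valid" 5x5 convolution cannot fit on a feature map that is only
    # 4 pixels high, so building this layer raises the error above:
    m.add(Conv2D(64, (5, 5), activation="relu", input_shape=(4, 80, 64)))
    # -> Negative dimension size caused by subtracting 5 from 4 ...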

Your stack of convolutional layers reduces the image size quite fast. Once the size along one dimension is only 4, you can no longer apply a 5x5 convolution.

Without padding, the output size of a convolutional layer along each dimension is floor((input_size - kernel_size) / stride) + 1. Subtracting 7 (or 5) a few times is not the real problem, but halving the size at every stride-2 layer brings the dimension down to 4 quite fast.
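
As a quick check (a sketch; conv_out is just a hypothetical helper, not a Keras function), plugging your layer parameters into that formula reproduces the shapes listed above:

    def conv_out(size, kernel, stride):
        # "valid" padding: floor((size - kernel) / stride) + 1
        return (size - kernel) // stride + 1

    h, w = 126, 1344  # height/width after Cropping2D
    for kernel, stride in [(7, 2), (7, 2), (5, 2), (5, 2)]:
        h, w = conv_out(h, kernel, stride), conv_out(w, kernel, stride)
        print(h, w)
    # prints 60 669, 27 332, 12 164, 4 80 -- matching the list above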

The solution is either not to use strides (after the first few layers) or to add padding. Keep in mind that padding helps against losing size due to the kernel, but not due to strides.
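
For example, one possible adjustment (a sketch, not necessarily the right architecture for your task) is to switch the later convolutions to padding="same", so that only the strides shrink the feature map:

    model_img.add(Conv2D(16, (7, 7), activation="relu", strides=(2, 2)))
    model_img.add(Conv2D(32, (7, 7), activation="relu", strides=(2, 2)))
    model_img.add(Conv2D(32, (5, 5), activation="relu", strides=(2, 2)))
    # From here on, "same" padding stops the kernel from eating into the
    # size; the stride still halves it: height 12 -> 6 -> 3.
    model_img.add(Conv2D(64, (5, 5), activation="relu", strides=(2, 2), padding="same"))
    model_img.add(Conv2D(64, (5, 5), activation="relu", strides=(2, 2), padding="same"))
    model_img.add(Conv2D(128, (3, 3), activation="relu", padding="same"))
    model_img.add(Conv2D(128, (3, 3), activation="relu", padding="same"))
    # Final feature map: (3, 41, 128) instead of a negative dimension.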
