Incompatible shapes: [128,1] vs. [128,3,3]

I am trying to create a CNN to classify the SVHN dataset, but I run into an incompatible shapes error when training my model: Incompatible shapes: [128,3,3,10] vs. [128,1]. How do I fix it?

    model = Sequential([
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu',
               input_shape=(32, 32, 3), name='conv_1'),
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu',
               padding='SAME', name='conv_2'),
        MaxPooling2D(pool_size=(8, 8), name='pool_1'),
        Dense(64, kernel_regularizer=regularizers.l2(0.5),
              bias_initializer='ones', activation='relu', name='dense_1'),
        Dropout(0.3),
        Dense(64, kernel_regularizer=regularizers.l2(0.5),
              activation='relu', name='dense_2'),
        BatchNormalization(),
        Dense(64, kernel_regularizer=regularizers.l2(0.5),
              activation='relu', name='dense_3'),
        Dense(10, activation='softmax', name='dense_4')
    ])


    model.compile(
        optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])

    history = model.fit(train_images, train_labels, epochs=30,
                        validation_split=0.15, batch_size=128,
                        verbose=False)

Add a Flatten layer between the pooling layer and the first Dense layer. Because you are not doing that, the tensor is never reduced to a single feature dimension per example before the layers that produce the class scores, which is why the model's output has shape [128, 3, 3, 10] instead of [128, 10].

It is a general pattern in TensorFlow to use a Flatten layer between the convolutional feature extractor and the Dense head that outputs the class.
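To see why the shapes clash: a Dense layer only operates on the last axis of its input, so applied to a 4-D tensor it leaves the spatial dimensions intact. Here is a minimal sketch, using the (3, 3, 8) shape that pool_1 produces in this model:

    import tensorflow as tf

    # Dense acts on the last axis only, so the 3x3 spatial dims survive.
    x = tf.random.normal((128, 3, 3, 8))             # pooled conv features
    y_bad = tf.keras.layers.Dense(10)(x)             # shape (128, 3, 3, 10)
    y_good = tf.keras.layers.Dense(10)(
        tf.keras.layers.Flatten()(x))                # shape (128, 10)
    print(y_bad.shape, y_good.shape)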

Also, I have removed the BatchNormalization you had put between the Dense layers. A BatchNormalization layer generally goes after a Conv layer, though you can put one after a Dense layer too. If you use BatchNormalization, apply it throughout the whole network, or at least the relevant part of it; don't just put in one random BatchNormalization layer.

Here's how you change your code to do that.

    from tensorflow.keras import regularizers
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import (Conv2D, BatchNormalization,
                                         MaxPooling2D, Flatten, Dense, Dropout)

    model = Sequential([
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu',
               input_shape=(32, 32, 3), name='conv_1'),
        BatchNormalization(),
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu',
               padding='SAME', name='conv_2'),
        BatchNormalization(),
        MaxPooling2D(pool_size=(8, 8), name='pool_1'),
        Flatten(),  # (batch, 3, 3, 8) -> (batch, 72)
        Dense(64, kernel_regularizer=regularizers.l2(0.5),
              bias_initializer='ones', activation='relu', name='dense_1'),
        Dropout(0.3),
        Dense(64, kernel_regularizer=regularizers.l2(0.5),
              activation='relu', name='dense_2'),
        Dense(64, kernel_regularizer=regularizers.l2(0.5),
              activation='relu', name='dense_3'),
        Dense(10, activation='softmax', name='dense_4')
    ])


    model.compile(
        optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])

    history = model.fit(train_images, train_labels, epochs=30)
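To sanity-check the fix, print the model summary; the final layer should now report an output shape of (None, 10), which sparse_categorical_crossentropy can compare against integer labels of shape (batch, 1):

    model.summary()   # dense_4 should show Output Shape (None, 10)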

I think there are two problems in your code.

First, check the shape of train_labels. The incompatible shapes error appears when tensor shapes do not match, and a shape of [128, 1] suggests that train_labels holds integer class indices rather than one-hot vectors. If train_labels looks like [1, 3, 8, ...], you can change it into one-hot form such as [[0, 1, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0, 0, 0], ...].
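A quick way to inspect the labels (assuming train_labels is a NumPy array or tensor):

    print(train_labels.shape)   # e.g. (73257, 1): integer class indices
    print(train_labels[:5])     # e.g. [[1], [9], [2], ...], not one-hot rows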

Second, you should add a Flatten layer before the Dense layers, as mentioned above:

    model = Sequential([
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu',
               input_shape=(32, 32, 3), name='conv_1'),
        ...
        MaxPooling2D(pool_size=(8, 8), name='pool_1'),

        Flatten(),

        Dense(64, kernel_regularizer=...
    ])

A shape of (32, 32, 1) means the last dimension of the input, the channel dimension, should be one, so you should change the input_shape of the first Conv2D to (32, 32, 1):

Conv2D(filters=8, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1) ...

Also, train_images needs a trailing channel dimension so that each image has shape (32, 32, 1), since the images have one channel:

train_images = tf.expand_dims(train_images, -1)
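You can then verify the new shape (assuming train_images started out as (n, 32, 32)):

    print(train_images.shape)   # should now be (n, 32, 32, 1)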

Additionally, you can get a one-hot version of train_labels like this:

train_labels = tf.squeeze(train_labels)
train_labels = tf.one_hot(train_labels, depth=10)
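One caveat: one-hot labels of shape (n, 10) do not work with the sparse_categorical_crossentropy loss used above, so the compile call has to switch to the non-sparse variant. A minimal sketch:

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',   # matches one-hot labels
        metrics=['accuracy'])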
