
Training a CNN classifier that accepts images of any input shape

I am trying to train an image classifier with two classes. Here is my neural net:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten

# Four 3x3 convolutions, then a two-class softmax head
model = Sequential()
model.add(Conv2D(3, (3, 3), activation="relu"))
model.add(Conv2D(32, (3, 3), activation="relu"))
model.add(Conv2D(16, (3, 3), activation="relu"))
model.add(Conv2D(8, (3, 3), activation="relu"))
model.add(Flatten())  # output length depends on the input image size
model.add(Dense(2, activation="softmax"))

This works fine when all images are resized to one particular size, but I want to train it without resizing the images. When I remove the Flatten layer, the model produces output for an image of any size. With the Flatten layer, I get an error the second time I use the model, on an image of a different size. Is there an alternative to the Flatten layer that works with any input shape? Please let me know.

You can't train a CNN without a fixed input shape, because inputs of different sizes produce feature maps of different sizes. You must use a function to resize all the input images:

import cv2

def my_resize(img, size=(32, 32)):
  # cv2.resize expects dsize as (width, height), with no channel entry
  return cv2.resize(img, dsize=size)

inputs = [...] # list of input images

outputs = list(map(my_resize, inputs))

You can make a CNN without a pre-specified input shape. You need to replace Flatten with GlobalMaxPool2D. This works because, unlike Flatten, GlobalMaxPool2D produces an output whose length equals the number of feature maps, irrespective of the spatial size of each feature map. Flatten unrolls every feature map into a single vector, so its output length depends on the spatial size, and that fixes the input size the following Dense layer expects. The shape of each feature map depends on the initial input size, but the number of feature maps is determined by the model. Specify the input shape as (None, None, channels); this lets the model know that the number of elements in those dimensions is not constant (just like the batch dimension). The answer may seem a bit messy, but in summary you have to do the following:

  1. Change Flatten to GlobalMaxPool2D
  2. Change the input shape to (None, None, channels): use one None per spatial dimension of the image. The number of channels must still be specified.
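
As a minimal sketch of these two steps, assuming TensorFlow 2.x and 3-channel images (the helper name build_model and the random test images are illustrative, not from the question), the model from the question could be rewritten like this:

```python
import numpy as np
import tensorflow as tf


def build_model(channels=3, num_classes=2):
    """Same convolution stack as in the question, but with no fixed image size."""
    return tf.keras.Sequential([
        # None for height and width: only the channel count is fixed.
        tf.keras.Input(shape=(None, None, channels)),
        tf.keras.layers.Conv2D(3, (3, 3), activation="relu"),
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
        tf.keras.layers.Conv2D(16, (3, 3), activation="relu"),
        tf.keras.layers.Conv2D(8, (3, 3), activation="relu"),
        # Takes the maximum of each feature map, so the output is always
        # (batch, 8) whatever the spatial size of the feature maps.
        tf.keras.layers.GlobalMaxPool2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])


model = build_model()

# Two random stand-in images of different sizes (assumed data, for illustration).
small = np.random.rand(1, 32, 32, 3).astype("float32")
large = np.random.rand(1, 100, 60, 3).astype("float32")

out_small = model(small).numpy()
out_large = model(large).numpy()
print(out_small.shape, out_large.shape)
```

Both calls should return one probability pair per image. Note that the images inside a single batch still need to share one shape, so training on mixed sizes usually means a batch size of 1 or grouping images of equal size into the same batch.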

You can do this simply by setting the spatial dimensions of the input shape to (None, None). To see why this works: a convolutional layer learns the parameters of its filters, whose shape is entirely independent of the shape of the input matrix. However, you must make sure that you train the model on images of varying shape; otherwise the learned weights will not generalize well to other input shapes. Simply upscaling low-resolution images typically will not suffice; you need actual data of varying shape. Downscaling high-resolution data should work fine, I suppose.
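
The point about filter shapes can be checked with plain arithmetic. The sketch below assumes the layer widths from the question, 3-channel inputs, and Keras' default unpadded, stride-1 convolutions; the function names are made up for illustration:

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a 2D convolution: kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def flatten_length(height, width, num_layers=4, kernel=3, last_filters=8):
    """Length of the Flatten output after `num_layers` unpadded, stride-1 convolutions."""
    shrink = num_layers * (kernel - 1)  # each 3x3 convolution trims 2 pixels per axis
    return (height - shrink) * (width - shrink) * last_filters


# Channel counts through the network: input, then the four Conv2D layers.
channels = [3, 3, 32, 16, 8]
conv_params = sum(
    conv2d_param_count(3, 3, c_in, c_out)
    for c_in, c_out in zip(channels, channels[1:])
)

# The convolution parameter count never mentions the image size ...
print("conv parameters:", conv_params)

# ... but the Flatten length, and so the Dense kernel shape, changes with it.
for height, width in [(32, 32), (100, 60)]:
    print((height, width), "->", flatten_length(height, width))
```

The convolution weights total 6764 for any image size, while the Flatten output has 4608 values for a 32x32 image and 38272 for a 100x60 image, which is why a Dense layer placed after Flatten cannot be reused across sizes.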

This is not possible. If you add even a simple perceptron, it has to be initialised with weights for a fixed number of inputs, and those weights will be completely random. Practically, you can't do what you intend to do, so you should resize the images before sending them into the model.

