
Changing parameters in a convolutional neural network

I'm practicing CNNs. I read some papers about training on the MNIST dataset with CNNs. The image size is 28x28, and the architecture has 5 layers: input > conv1-maxpool1 > conv2-maxpool2 > fully connected > output.

Convolutional Layer #1
- Computes 32 features using a 5x5 filter with ReLU activation.
- Padding is added to preserve width and height.
- Input Tensor Shape: [batch_size, 28, 28, 1]
- Output Tensor Shape: [batch_size, 28, 28, 32] 
Pooling Layer #1
- First max pooling layer with a 2x2 filter and stride of 2
- Input Tensor Shape: [batch_size, 28, 28, 32]
- Output Tensor Shape: [batch_size, 14, 14, 32] 
Convolutional Layer #2
- Computes 64 features using a 5x5 filter.
- Padding is added to preserve width and height.
- Input Tensor Shape: [batch_size, 14, 14, 32]
- Output Tensor Shape: [batch_size, 14, 14, 64] 
Pooling Layer #2
- Second max pooling layer with a 2x2 filter and stride of 2
- Input Tensor Shape: [batch_size, 14, 14, 64]
- Output Tensor Shape: [batch_size, 7, 7, 64] 
Flatten tensor into a batch of vectors
- Input Tensor Shape: [batch_size, 7, 7, 64]
- Output Tensor Shape: [batch_size, 7 * 7 * 64] 
Fully Connected Layer
- Densely connected layer with 1024 neurons
- Input Tensor Shape: [batch_size, 7 * 7 * 64]
- Output Tensor Shape: [batch_size, 1024] 
Output Layer
- Input Tensor Shape: [batch_size, 1024]
- Output Tensor Shape: [batch_size, 10]
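The layer-by-layer shapes above can be traced with a few lines of plain Python (a sketch not tied to any framework; the helper names are illustrative):

```python
def conv2d_same(h, w, c, filters):
    # A convolution with "same" padding preserves height and width;
    # the channel count becomes the number of filters.
    return h, w, filters

def maxpool_2x2(h, w, c):
    # A 2x2 max-pool with stride 2 halves height and width.
    return h // 2, w // 2, c

shape = (28, 28, 1)                       # input image
shape = conv2d_same(*shape, filters=32)   # (28, 28, 32)
shape = maxpool_2x2(*shape)               # (14, 14, 32)
shape = conv2d_same(*shape, filters=64)   # (14, 14, 64)
shape = maxpool_2x2(*shape)               # (7, 7, 64)
flat = shape[0] * shape[1] * shape[2]     # 7 * 7 * 64 = 3136
print(shape, flat)
```

This reproduces the flattened size of 7 * 7 * 64 = 3136 features fed into the dense layer.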

In conv1, 1 input channel is used to compute 32 features with a 5x5 filter, and in conv2 the 32 channels from conv1 are used to compute 64 features with the same filter size. What are parameters such as 32, 64, and the 2x2 filter chosen based on? Are they based on the size of the image?

If the image size is larger than 28x28, such as 128x128, should I increase the number of layers beyond 5? How do the above parameters change with other image sizes?

Thanks in advance.

At its base level, those inputs are called hyperparameters, and they are not typically defined by any particular set of rules. That said, we often use rules of thumb (heuristics) to choose a set of hyperparameters and then use hyperparameter optimisation to improve performance or efficiency.

A great explanation of this is here.

Edit: further info in this paper. It's a widely studied problem; look into arXiv and Stats.StackExchange for more, but here is a great paper I used when I was learning.

What are parameters such as 32, 64, and the 2x2 filter chosen based on? Are they based on the size of the image?

The parameters you mentioned (32, 64, 2x2) are the number of filters in a convolutional layer and a filter size. They are hyperparameters that you can select and adjust as you train your models. You can tune them depending on your dataset, your application, and how the model performs.

The number of filters controls how many features your model learns. In your model, the filter count increases from 32 to 64 after the max-pooling layer. A max-pooling layer with a 2x2 filter and stride 2 halves the width and the height, so it keeps only a quarter of the spatial positions; doubling the filter count afterwards recovers half of that, so the representation shrinks gradually rather than abruptly. By convention, the filter count is doubled after each 2x2 max-pooling layer for this reason.
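The arithmetic behind that convention can be checked directly, using the numbers from the architecture in the question:

```python
# Activations per example around pooling layer #1 (shapes from the question).
before_pool  = 28 * 28 * 32   # 25088 activations entering the pool
after_pool   = 14 * 14 * 32   # 6272  -> pooling keeps a quarter of the positions
after_double = 14 * 14 * 64   # 12544 -> doubling the filters restores half
print(before_pool, after_pool, after_double)
```

So each conv+pool block roughly halves, not preserves, the total number of activations, while increasing the number of distinct feature channels.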

As for the filter size: in a max-pooling layer it determines how much the feature map is reduced, and in a convolutional layer it determines how fine-grained the learned features are. For example, if you are working with images where small pixel-level details differentiate objects, you would choose a small filter size such as 3x3 or 5x5, and vice versa for larger structures.
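One standard aside on filter size (a well-known observation, not something stated in the answer above): two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution while using fewer weights:

```python
# Weight counts for equal receptive fields (channel count is illustrative).
c = 64                          # input and output channels
one_5x5 = 5 * 5 * c * c         # single 5x5 conv: 102400 weights
two_3x3 = 2 * (3 * 3 * c * c)   # two stacked 3x3 convs: 73728 weights
print(one_5x5, two_3x3)
```

This is one reason many modern architectures prefer stacks of small filters over a single large one.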

One way to understand these hyperparameters is to learn how each one affects the model's learning, so you know how to adjust them case by case. Another way is to look at how they are set in models built by other people; you will find conventions such as the filter count increasing after each max-pooling layer.

If the image size is larger than 28x28, such as 128x128, should I increase the number of layers beyond 5? How do the above parameters change with other image sizes?

As for layers: having more layers makes your model deeper and adds more parameters. The model becomes more complex and able to learn richer image features, so a deeper architecture can help with large-resolution images, which contain more features to learn. However, this depends on the case. A good approach is to start with a simple model with a few layers and add layers gradually as long as doing so improves your model.
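Following that advice for a 128x128 input, one hypothetical scaling rule (my own sketch, with an arbitrary stopping threshold) is to keep adding conv + 2x2-pool blocks, doubling the filters each time, until the feature map is small before flattening:

```python
def pool_2x2(h, w):
    # A 2x2 max-pool with stride 2 halves each spatial dimension.
    return h // 2, w // 2

h, w, filters = 128, 128, 32   # start with 32 filters, as in the MNIST model
blocks = []
while h > 8:                   # stop once the grid is small (8x8 is a choice)
    blocks.append((h, w, filters))  # conv block applied at this resolution
    h, w = pool_2x2(h, w)
    filters *= 2

# blocks: [(128,128,32), (64,64,64), (32,32,128), (16,16,256)]
flat = h * w * blocks[-1][2]   # 8 * 8 * 256 = 16384 features into the dense layer
print(len(blocks), flat)
```

Under this rule a 128x128 input gets 4 conv+pool blocks instead of 2, which matches the intuition that larger images warrant deeper networks; whether the extra depth actually helps should still be verified experimentally.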
