
Training a pre-trained sequential model with a different input shape

I have a pre-trained sequential CNN model which I trained on images of 224x224x3. The following is the architecture:

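from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense

# num_classes (the number of target classes) is assumed to be defined earlier.
model = Sequential()

# Convolutional block 1: the input shape is fixed to 224x224x3 here
model.add(Conv2D(filters=64, kernel_size=(5, 5), strides=1, activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPool2D(pool_size=(3, 3)))
model.add(Dropout(0.2))

# Convolutional block 2
model.add(Conv2D(filters=128, kernel_size=(3, 3), strides=1, activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.2))

# Convolutional block 3
model.add(Conv2D(filters=256, kernel_size=(2, 2), strides=1, activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.2))

# Classifier head: Flatten makes the Dense input size depend on the image size
model.add(Flatten())
model.add(Dense(128, activation='relu', use_bias=False))

model.add(Dense(num_classes, activation='softmax'))

model.summary()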

For reference, here is the model summary: (image: model summary)

I want to retrain this model on images of size 40x40x3. However, I am facing the following error: "ValueError: Input 0 of layer dense_12 is incompatible with the layer: expected axis -1 of input shape to have value 200704 but received input with shape (None, 256)". What should I do to resolve this error?

Note: I am using TensorFlow version 2.4.1

The problem is that in your pre-trained model the first dense layer expects a flattened input of size 200704 (line 4 from the end of the summary) and has an output size of 128 (line 3 from the end). If you now want to use the same pre-trained model on 40x40 images, it will not work. The reasons are:

1- Your model depends on the input image shape. It is not an end-to-end convolutional model: the dense layers after the conv layers have a fixed input size, which ties the model to one input image size.

2- The flattened size for a 40x40 image after all the conv layers is 256, not 200704.
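
The dependence of the flattened size on the input size can be checked with plain arithmetic. The sketch below is an illustration only: it assumes the layers exactly as listed in the question, with Keras defaults (no padding on `Conv2D`, and a `MaxPool2D` stride equal to its pool size). The helper names `conv_out` and `flatten_size` are made up for this example. For the listing as written it gives 73984 for 224x224 and 1024 for 40x40; the figures in the error message (200704 and 256) come from the model that was actually saved, which may differ slightly from the listing. Either way, the two sizes do not match, and that mismatch is what the dense layer rejects.

```python
def conv_out(size, kernel, stride=1):
    """Spatial size after an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


def flatten_size(side, channels=256):
    """Walk a square input through the conv/pool stack listed in the question."""
    side = conv_out(side, kernel=5)            # Conv2D 5x5, stride 1
    side = conv_out(side, kernel=3, stride=3)  # MaxPool2D 3x3 (stride = pool size)
    side = conv_out(side, kernel=3)            # Conv2D 3x3, stride 1
    side = conv_out(side, kernel=2, stride=2)  # MaxPool2D 2x2
    side = conv_out(side, kernel=2)            # Conv2D 2x2, stride 1
    side = conv_out(side, kernel=2, stride=2)  # MaxPool2D 2x2
    # Flatten turns (side, side, channels) into one vector of this length.
    return side * side * channels


for side in (224, 40):
    print(side, flatten_size(side))
```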

Solution

1- Either replace the flatten part with an adaptive average pooling layer, after which your last dense layer with softmax is fine. Because this changes the input size of the dense layer, retrain your old model on the 224x224 images again. Following that, you can train on your 40x40 images.

2- Or, the easiest way, use only the part of your pre-trained model that comes before the flatten layer (excluding the flatten layer itself), and then add a new flatten layer followed by a dense layer and a classification layer (the layer with softmax). For this method you have to write a custom model: the first part is the subset of the pre-trained model, and the flatten and classification parts are new. You can then train the whole model on the new dataset. With this method you can also benefit from transfer learning, by letting the backward gradient flow only through the newly created dense layers and not through the pre-trained layers.
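
Option 1 could look roughly like the sketch below. It is one possible reading, not the answerer's code: Keras has no layer called "adaptive average pooling", so `GlobalAveragePooling2D` is used here as the closest built-in equivalent, and the input is declared as `(None, None, 3)` so the same model accepts any image size. `num_classes = 10` is a placeholder value, and `build_size_agnostic_model` is a name invented for this example.

```python
import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import (
    Conv2D, Dense, Dropout, GlobalAveragePooling2D, MaxPool2D,
)

num_classes = 10  # placeholder: use the real number of classes


def build_size_agnostic_model(num_classes):
    """Same conv stack as the question, with Flatten swapped for global pooling."""
    return Sequential([
        Input(shape=(None, None, 3)),  # height and width left unspecified
        Conv2D(64, (5, 5), activation="relu"),
        MaxPool2D((3, 3)),
        Dropout(0.2),
        Conv2D(128, (3, 3), activation="relu"),
        MaxPool2D((2, 2)),
        Dropout(0.2),
        Conv2D(256, (2, 2), activation="relu"),
        MaxPool2D((2, 2)),
        Dropout(0.2),
        # Averages each of the 256 feature maps to one value, so the dense
        # layers always see 256 inputs regardless of the image size.
        GlobalAveragePooling2D(),
        Dense(128, activation="relu", use_bias=False),
        Dense(num_classes, activation="softmax"),
    ])


model = build_size_agnostic_model(num_classes)

# The same weights handle both image sizes.
out_large = model(np.zeros((1, 224, 224, 3), dtype="float32"))
out_small = model(np.zeros((1, 40, 40, 3), dtype="float32"))
print(out_large.shape, out_small.shape)
```

The old dense weights cannot be carried over, because their input size no longer matches. That is why this option needs retraining on the 224x224 images before fine-tuning on the 40x40 ones.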
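
Option 2 might be sketched as follows. Several things here are assumptions for the sake of a runnable example: `build_pretrained_stand_in` creates a randomly initialised copy of the question's architecture in place of the real trained model (which would normally be loaded from disk, for example with `tf.keras.models.load_model`), and `old_num_classes`, `new_num_classes` and the size of the new dense head are placeholder choices.

```python
import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPool2D

old_num_classes = 10  # placeholder
new_num_classes = 5   # placeholder


def build_pretrained_stand_in(num_classes):
    """Stand-in for the trained 224x224 model; real code would load it instead."""
    return Sequential([
        Input(shape=(224, 224, 3)),
        Conv2D(64, (5, 5), activation="relu"),
        MaxPool2D((3, 3)),
        Dropout(0.2),
        Conv2D(128, (3, 3), activation="relu"),
        MaxPool2D((2, 2)),
        Dropout(0.2),
        Conv2D(256, (2, 2), activation="relu"),
        MaxPool2D((2, 2)),
        Dropout(0.2),
        Flatten(),
        Dense(128, activation="relu", use_bias=False),
        Dense(num_classes, activation="softmax"),
    ])


pretrained = build_pretrained_stand_in(old_num_classes)

# Collect every layer before Flatten and freeze it, so gradients only
# update the new head (the transfer-learning setup described above).
conv_base = []
for layer in pretrained.layers:
    if isinstance(layer, Flatten):
        break
    layer.trainable = False
    conv_base.append(layer)

# Reusing the layer objects shares their weights with the new model.
new_model = Sequential([
    Input(shape=(40, 40, 3)),
    *conv_base,
    Flatten(),                      # new flatten, sized for 40x40 inputs
    Dense(128, activation="relu"),  # new dense layer
    Dense(new_num_classes, activation="softmax"),
])
new_model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

probs = new_model(np.zeros((1, 40, 40, 3), dtype="float32"))
print(probs.shape, len(new_model.trainable_weights))
```

After the new head has converged, the frozen layers can be unfrozen (set `trainable = True` and compile again) to fine-tune the whole model on the new dataset.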
