How to implement Transfer-Learning with own model

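I am trying to save a CNN model that I implemented and use it for transfer learning (TL). I would like to clarify the following four points.
1. (CNN code) Is the model saved correctly?
2. (TL code) Is the model loaded correctly?
3. (TL code) How is the "trainable" attribute of the loaded model usually set?
4. (TL code) Are the pretrained model and the subsequent layers combined correctly (sizes, etc.)?

Below are the model part of the CNN and the transfer-learning code.
Both are regression models that predict two numbers. The input is images (plus supervised labels).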


#CNN

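# Imports were omitted in the original post; tensorflow.keras is assumed here.
import numpy as np
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam

input = Input(shape=(100,100,3))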
conv_0 = Conv2D(32,kernel_size=3,activation='relu')(input)
pool_0 = MaxPooling2D(pool_size=(2,2))(conv_0)
pool_0 = Dropout(0.25)(pool_0)
conv_1 = Conv2D(64,kernel_size=3,activation='relu')(pool_0)
pool_1 = MaxPooling2D(pool_size=(2,2))(conv_1)
pool_1 = Dropout(0.25)(pool_1)
conv_2 = Conv2D(32,kernel_size=3,activation='relu')(pool_1)
pool_2 = MaxPooling2D(pool_size=(2,2))(conv_2)
pool_2 = Dropout(0.25)(pool_2)
conv_3 = Conv2D(16,kernel_size=3,activation='relu')(pool_2)
pool_3 = MaxPooling2D(pool_size=(2,2))(conv_3)
conv_4 = Conv2D(8,kernel_size=3,activation='relu')(pool_3)
pool_4 = MaxPooling2D(pool_size=(2,2))(conv_4)
flat = Flatten()(pool_4)
denseL = Dense(64,activation='relu')(flat)
denseL = Dropout(0.25)(denseL)
A_output = Dense(1,name="a")(denseL)
B_output = Dense(1,name="b")(denseL)

model = Model(inputs=input, outputs=[A_output,B_output])
model.compile(Adam(learning_rate=0.001),
              loss = {'a':'mae','b':'mae'} ,
              metrics =  {'a':'mae','b':'mae'})

history = model.fit([np.array(Img_train)],[np.array(LabelA_train),np.array(LabelB_train)],
                    epochs=100, batch_size=16,
                    validation_data=([np.array(Img_test)],[np.array(LabelA_test),np.array(LabelB_test)]))

model.save('forTransferL.h5')

"""
Output of model.summary()
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 100, 100, 3) 0
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 98, 98, 32)   896         input_1[0][0]
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 49, 49, 32)   0           conv2d[0][0]
__________________________________________________________________________________________________
dropout (Dropout)               (None, 49, 49, 32)   0           max_pooling2d[0][0]
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 47, 47, 64)   18496       dropout[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 23, 23, 64)   0           conv2d_1[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 23, 23, 64)   0           max_pooling2d_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 21, 21, 32)   18464       dropout_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 10, 10, 32)   0           conv2d_2[0][0]
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 10, 10, 32)   0           max_pooling2d_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 8, 8, 16)     4624        dropout_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 4, 4, 16)     0           conv2d_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 2, 2, 8)      1160        max_pooling2d_3[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 1, 1, 8)      0           conv2d_4[0][0]
__________________________________________________________________________________________________
flatten (Flatten)               (None, 8)            0           max_pooling2d_4[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, 64)           576         flatten[0][0]
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 64)           0           dense[0][0]
__________________________________________________________________________________________________
a (Dense)                       (None, 1)            65          dropout_3[0][0]
__________________________________________________________________________________________________
b (Dense)                       (None, 1)            65          dropout_3[0][0]
=========================================================================================
"""

#TL

model = load_model('forTransferL.h5')
model.layers[0].trainable = False
x = model.layers[10].output

# The following is the same as part of CNN model.
conv_3 = Conv2D(16,kernel_size=3,activation='relu')(x)
pool_3 = MaxPooling2D(pool_size=(2,2))(conv_3)
conv_4 = Conv2D(8,kernel_size=3,activation='relu')(pool_3)
pool_4 = MaxPooling2D(pool_size=(2,2))(conv_4)
flat = Flatten()(pool_4)
denseL = Dense(64,activation='relu')(flat)
denseL = Dropout(0.25)(denseL)
A_output = Dense(1,name="a")(denseL)
B_output = Dense(1,name="b")(denseL)

model=Model(inputs=model.input,outputs=[A_output,B_output])

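Just in case, I am also including the error text that currently appears below,
but I think understanding the basic implementation is more important than fixing the error.

Thank you for your help.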

#error message

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:/Users/userABC/OneDrive/Document/StudyAI/transferMymodel.py", line 158, in <module>
    pool_m4 = MaxPooling2D(pool_size=(2,2))(conv_m4)
  File "C:\Users\userABC\anaconda3\lib\site-packages\keras\engine\base_layer.py", line 1006, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  〜Omitted due to the limited number of characters.

ValueError: Negative dimension size caused by subtracting 2 from 1 for '{{node tf.compat.v1.nn.max_pool_1/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", explicit_paddings=[], ksize=[1, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 1]](Placeholder)' with input shapes: [?,1,1,8].

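For points 1 and 2, your code looks correct, although when a model contains custom layers I personally prefer loading the model weights rather than the model itself. That does not matter in your case, though.

For point 3, see https://keras.io/guides/transfer_learning/

If you set trainable = False on a model, or on any layer that has sublayers, all of its sublayers become non-trainable as well.

In your case, this means that setting "model.trainable = False" freezes all the weights of the loaded part of the model (prevents them from being updated). By contrast, "model.layers[0].trainable = False" only affects input_1, the InputLayer, which has no weights according to your summary, so it freezes nothing.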
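As a minimal sketch of the freezing step, assuming tensorflow.keras is available: the small untrained model named base below is a hypothetical stand-in for load_model('forTransferL.h5'), and tl_model is an illustrative name. The sketch freezes the whole loaded part before the new layers are attached, so only the new head keeps trainable weights.

```python
from tensorflow.keras import layers, Model

# Hypothetical stand-in for load_model('forTransferL.h5'); untrained, for illustration only.
inp = layers.Input(shape=(100, 100, 3))
x = layers.Conv2D(32, kernel_size=3, activation="relu")(inp)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Conv2D(64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
base = Model(inputs=inp, outputs=x, name="base")

# Freeze every layer of the loaded part in one step.
base.trainable = False

# New head, created after freezing, so its layers stay trainable.
y = layers.Conv2D(16, kernel_size=3, activation="relu", padding="same")(base.output)
y = layers.MaxPooling2D(pool_size=(2, 2))(y)
y = layers.Flatten()(y)
y = layers.Dense(64, activation="relu")(y)
y = layers.Dropout(0.25)(y)
a = layers.Dense(1, name="a")(y)
b = layers.Dense(1, name="b")(y)

tl_model = Model(inputs=base.input, outputs=[a, b])

# Compile after changing trainable so the setting is taken into account.
tl_model.compile(optimizer="adam", loss={"a": "mae", "b": "mae"})

# The frozen base contributes no trainable weights; only the new head does.
print(len(base.trainable_weights), len(tl_model.trainable_weights))
```

If only some of the loaded layers should stay fixed, the same attribute can be set layer by layer instead of on the whole model.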

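For point 4, the pretrained model and the subsequent layers look correctly combined, but the error occurs because you did not set padding='same' in the Conv2D layers, for example: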

conv_3 = Conv2D(16,kernel_size=3,activation='relu', padding='same')(x)

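This matters because without 'same' padding each Conv2D layer with a 3x3 kernel shrinks the height and width of the feature map by 2. Given that the output of your pretrained part has shape (8, 8, 16), the first convolution produces (6, 6, 16), the first max pooling produces (3, 3, 16), and the second convolution produces (1, 1, 8). At that point the second max pooling can no longer pool the features, because the height and width are smaller than (2, 2).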
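The size arithmetic above can be checked with a few lines of plain Python. This is only a sketch of the shape bookkeeping, assuming stride-1 convolutions with a 3x3 kernel and 2x2 pooling with stride 2 as in the question; the helper names are made up for illustration and are not Keras functions. It also traces a 10x10 input, because the summary in the question suggests that model.layers[10] is conv2d_3 (output 8x8) while model.layers[9] is dropout_2 (output 10x10), so starting from index 9 would reproduce the original sizes without padding.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with padding='valid'."""
    return size - kernel + 1


def conv_same(size, kernel=3):
    """Output size of a stride-1 convolution with padding='same'."""
    return size


def pool_valid(size, pool=2):
    """Output size of max pooling with pool_size == strides and padding='valid'."""
    if size < pool:
        raise ValueError(f"cannot pool a size-{size} map with a {pool}x{pool} window")
    return size // pool


def trace(size, steps):
    """Apply each step in order and return all sizes, starting with the input size."""
    sizes = [size]
    for step in steps:
        size = step(size)
        sizes.append(size)
    return sizes


# conv_3 -> pool_3 -> conv_4 -> pool_4, with and without padding='same'
head_valid = [conv_valid, pool_valid, conv_valid, pool_valid]
head_same = [conv_same, pool_valid, conv_same, pool_valid]

print(trace(8, head_same))    # [8, 8, 4, 4, 2]
print(trace(10, head_valid))  # [10, 8, 4, 2, 1]

try:
    trace(8, head_valid)      # 8 -> 6 -> 3 -> 1, then pooling fails
except ValueError as err:
    print("valid padding from 8x8 fails:", err)
```

Either change avoids the negative dimension: adding padding='same' to the new Conv2D layers, or branching from the layer whose output matches what the copied layers originally received.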
