Remove top layer from pre-trained model, transfer learning, tensorflow (load_model)

I have pre-trained a model (my own saved model) with two classes, and I want to use it for transfer learning to train a model with six classes. I have loaded the pre-trained model into the new training script:

base_model = tf.keras.models.load_model("base_model_path")

How can I remove the top/head layer (a Conv1D layer)?

I see that in Keras one can use base_model.pop(), and for tf.keras.applications one can simply pass include_top=False, but is there something similar when using tf.keras and load_model?

(I have tried something like this:

for layer in base_model.layers[:-1]:
    layer.trainable = False

and then adding it to a new model (?), but I am not sure how to continue.)

Thanks for any help!

You could try something like this:

The base model is a simple Conv1D network whose output layer has two classes:

import tensorflow as tf

samples = 100
timesteps = 5
features = 2
classes = 2
# Random dummy data: inputs of shape (samples, timesteps, features), integer labels in [0, classes)
dummy_x = tf.random.normal((samples, timesteps, features))
dummy_y = tf.random.uniform((samples, 1), maxval=classes, dtype=tf.int32)

base_model = tf.keras.Sequential()
base_model.add(tf.keras.layers.Conv1D(32, 3, activation='relu', input_shape=(timesteps, features)))
base_model.add(tf.keras.layers.GlobalMaxPool1D())
base_model.add(tf.keras.layers.Dense(32, activation='relu'))
base_model.add(tf.keras.layers.Dense(classes, activation='softmax'))  # two-class output layer

base_model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy())
print(base_model.summary())
base_model.fit(dummy_x, dummy_y, batch_size=16, epochs=1)
# Save and reload to mimic the load_model workflow from the question
base_model.save("base_model")
base_model = tf.keras.models.load_model("base_model")
Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv1d_31 (Conv1D)          (None, 3, 32)             224       
                                                                 
 global_max_pooling1d_13 (Gl  (None, 32)               0         
 obalMaxPooling1D)                                               
                                                                 
 dense_17 (Dense)            (None, 32)                1056      
                                                                 
 dense_18 (Dense)            (None, 2)                 66        
                                                                 
=================================================================
Total params: 1,346
Trainable params: 1,346
Non-trainable params: 0
_________________________________________________________________
None
7/7 [==============================] - 0s 3ms/step - loss: 0.6973
INFO:tensorflow:Assets written to: base_model/assets
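
Before slicing base_model.layers, it can help to list each layer's index, name and type so that the right range is picked. The snippet below is only a minimal sketch: it rebuilds a stand-in for the two-class model in memory rather than calling load_model, and the variable layer_info is a made-up name for illustration.

```python
import tensorflow as tf

# Stand-in for the loaded two-class model (assumption: same layout as above).
base_model = tf.keras.Sequential([
    tf.keras.Input(shape=(5, 2)),
    tf.keras.layers.Conv1D(32, 3, activation='relu'),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax'),
])

# Collect (index, name, class name, trainable flag) for every layer.
layer_info = [(i, layer.name, type(layer).__name__, layer.trainable)
              for i, layer in enumerate(base_model.layers)]

for i, name, kind, trainable in layer_info:
    print(i, name, kind, trainable)
```

The last index is the output layer, so base_model.layers[:-1] keeps everything except the head, and base_model.layers[1:-1] additionally drops the first Conv1D layer.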

The new model is also a simple Conv1D network, but its output layer has six classes. It reuses all the layers of base_model except the first Conv1D layer and the last output layer. Because the same layer objects are added, they bring their pre-trained weights with them:

classes = 6
# New dummy labels for the six-class problem
dummy_x = tf.random.normal((samples, timesteps, features))
dummy_y = tf.random.uniform((samples, 1), maxval=classes, dtype=tf.int32)

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(64, 3, activation='relu', input_shape=(timesteps, features)))
model.add(tf.keras.layers.Conv1D(32, 2, activation='relu'))
for layer in base_model.layers[1:-1]:  # skip the first (Conv1D) and last (output) layer
    model.add(layer)
model.add(tf.keras.layers.Dense(classes, activation='softmax'))  # new six-class output layer
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy())
print(model.summary())
model.fit(dummy_x, dummy_y, batch_size=16, epochs=1)
Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv1d_32 (Conv1D)          (None, 3, 64)             448       
                                                                 
 conv1d_33 (Conv1D)          (None, 2, 32)             4128      
                                                                 
 global_max_pooling1d_13 (Gl  (None, 32)               0         
 obalMaxPooling1D)                                               
                                                                 
 dense_17 (Dense)            (None, 32)                1056      
                                                                 
 dense_19 (Dense)            (None, 6)                 198       
                                                                 
=================================================================
Total params: 5,830
Trainable params: 5,830
Non-trainable params: 0
_________________________________________________________________
None
7/7 [==============================] - 0s 3ms/step - loss: 1.8069
<keras.callbacks.History at 0x7f90c87a3c50>
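
If the goal is only to drop the output layer and keep everything else (including the first Conv1D layer), one option is to re-apply the remaining layers with the functional API and freeze them, which is roughly where the loop in the question was heading. The following is a minimal sketch under a few assumptions, not a definitive recipe: it builds a stand-in for base_model in memory instead of calling load_model, and the names new_head and transfer_model are made up for illustration.

```python
import tensorflow as tf

timesteps, features = 5, 2

# Stand-in for the loaded two-class model (assumption: same layout as above).
base_model = tf.keras.Sequential([
    tf.keras.Input(shape=(timesteps, features)),
    tf.keras.layers.Conv1D(32, 3, activation='relu'),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax'),
])

# Re-apply every layer except the last one to a fresh input tensor.
inputs = tf.keras.Input(shape=(timesteps, features))
x = inputs
for layer in base_model.layers[:-1]:
    layer.trainable = False  # freeze the pre-trained weights
    x = layer(x)

# Hypothetical new six-class head; it is the only trainable layer.
new_head = tf.keras.layers.Dense(6, activation='softmax', name='new_head')
outputs = new_head(x)

transfer_model = tf.keras.Model(inputs, outputs)
transfer_model.compile(optimizer='adam',
                       loss=tf.keras.losses.SparseCategoricalCrossentropy())

# Dummy six-class data, as in the example above.
dummy_x = tf.random.normal((100, timesteps, features))
dummy_y = tf.random.uniform((100, 1), maxval=6, dtype=tf.int32)
transfer_model.fit(dummy_x, dummy_y, batch_size=16, epochs=1, verbose=0)
```

Setting trainable = False before compile means only the kernel and bias of new_head are updated during fit. To fine-tune later, the frozen layers can be set back to trainable and the model compiled again, typically with a lower learning rate.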

