How to do Transfer Learning without ImageNet weights?

This is a description of my project:

Dataset1: the bigger data set; it contains two classes of images (binary classification).

Dataset2: contains 2 classes that are very similar in appearance to Dataset1. I want to build a model that uses transfer learning by learning from Dataset1 and then applies those weights, with a lower learning rate, on Dataset2.

Therefore I'm looking to train the entire VGG16 on dataset1, then use transfer learning to fine-tune the last layers for dataset2. I do not want to use the pre-trained ImageNet weights. This is the code I am using, and I have saved the weights from it:


from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
import numpy as np
from glob import glob
import matplotlib.pyplot as plt

vgg = VGG16(input_shape=(224, 224, 3), weights=None, include_top=False)  # 224 matches the generators below

# don't train existing weights
for layer in vgg.layers:
    layer.trainable = False
    
x = Flatten()(vgg.output)   

prediction = Dense(2, activation='softmax')(x)

model = Model(inputs=vgg.input, outputs=prediction)

model.compile(
  loss='categorical_crossentropy',
  optimizer='adam',
  metrics=['accuracy']
)


train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('chest_xray/train',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

test_set = train_datagen.flow_from_directory('chest_xray/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

# fit the model
r = model.fit_generator(
  training_set,
  validation_data=test_set,
  epochs=5,
  steps_per_epoch=len(training_set),
  validation_steps=len(test_set)
)

model.save_weights('first_try.h5') 

Update

Based on your query, it seems that the number of classes won't be different in Dataset2. At the same time, you also don't want to use the ImageNet weights. In that case, you don't need to map or transfer the weights manually (as described below). Just load the model and its weights and train on Dataset2: freeze all the layers trained on Dataset1 and train only the last layer on Dataset2; really straightforward.
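A minimal sketch of that workflow, assuming the architecture from your code above and that the Dataset1 weights were saved to first_try.h5; the learning rate of 1e-4 is illustrative:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg16 import VGG16

# rebuild the same architecture and load the Dataset1 weights
vgg = VGG16(input_shape=(224, 224, 3), weights=None, include_top=False)
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)
model = Model(inputs=vgg.input, outputs=prediction)
model.load_weights('first_try.h5')

# freeze everything except the classifier head
for layer in model.layers[:-1]:
    layer.trainable = False

# recompile with a lower learning rate, then fit on the Dataset2 generators
model.compile(
    loss='categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    metrics=['accuracy'])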

In my response below, though you don't need the full information, I am keeping it anyway for future reference.


Here is a small demonstration of what you probably need; hope it gives you some insight. Here we will train on the CIFAR-10 data set, which has 10 classes, and try to use it for transfer learning on a different data set which has different input sizes and a different number of classes.

Preparing the CIFAR (10 Classes)

import numpy as np
import tensorflow as tf 
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# train set / data 
x_train = x_train.astype('float32') / 255

# validation set / data 
x_test = x_test.astype('float32') / 255

# train set / target 
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)
# validation set / target 
y_test = tf.keras.utils.to_categorical(y_test, num_classes=10)

print(x_train.shape, y_train.shape) 
print(x_test.shape, y_test.shape)  
'''
(50000, 32, 32, 3) (50000, 10)
(10000, 32, 32, 3) (10000, 10)
'''

Model

# declare input shape 
input = tf.keras.Input(shape=(32,32,3))
# Block 1
x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(input)
x = tf.keras.layers.MaxPooling2D(3)(x)

# Now we apply global max pooling.
gap = tf.keras.layers.GlobalMaxPooling2D()(x)

# Finally, we add a classification layer.
output = tf.keras.layers.Dense(10, activation='softmax')(gap)

# bind all
func_model = tf.keras.Model(input, output)

'''
Model: "functional_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 15, 15, 32)        896       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 32)          0         
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                330       
=================================================================
Total params: 1,226
Trainable params: 1,226
Non-trainable params: 0
'''

Run the model to get some weight matrices as follows:

# compile 
print('\nFunctional API')
func_model.compile(
          loss      = tf.keras.losses.CategoricalCrossentropy(),
          metrics   = tf.keras.metrics.CategoricalAccuracy(),
          optimizer = tf.keras.optimizers.Adam())
# fit 
func_model.fit(x_train, y_train, batch_size=128, epochs=1)

Transfer Learning

Let's use it for MNIST. It also has 10 classes, but since we want a different number of classes, we will make even and odd categories from it (2 classes). Below is how we will prepare these data sets:

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# train set / data 
x_train = np.expand_dims(x_train, axis=-1)
x_train = np.repeat(x_train, 3, axis=-1)
x_train = x_train.astype('float32') / 255
# train set / target 
y_train = tf.keras.utils.to_categorical((y_train % 2 == 0).astype(int), 
                                        num_classes=2)

# validation set / data 
x_test = np.expand_dims(x_test, axis=-1)
x_test = np.repeat(x_test, 3, axis=-1)
x_test = x_test.astype('float32') / 255
# validation set / target 

y_test = tf.keras.utils.to_categorical((y_test % 2 == 0).astype(int), 
                                       num_classes=2)

print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)  
'''
(60000, 28, 28, 3) (60000, 2)
(10000, 28, 28, 3) (10000, 2)
'''

If you're familiar with using ImageNet pretrained weights in Keras models, you probably use include_top. By setting it to False, we can easily load a weight file that has no top (classifier) information of the pretrained model. So here we need to do that manually, kind of. We need to grab the weight matrices up to the last activation layer (in our case Dense(10, softmax)), put them into a new instance of the base model, and add a new classifier layer (in our case Dense(2, softmax)).

for i, layer in enumerate(func_model.layers):
    print(i,'\t',layer.trainable,'\t  :',layer.name)

'''
  Train_Bool  : Layer Names
0    True     : input_1
1    True     : conv2d
2    True     : max_pooling2d
3    True     : global_max_pooling2d # < we go till here to grab the weight and biases
4    True     : dense  # 10 classes (from previous model)
'''

Get Weights

# collect the weights to carry over; note GlobalMaxPooling2D has no
# parameters of its own, so for this layer the list stays empty (for layers
# with parameters, e.g. Conv2D, get_weights() returns kernels and biases)
sparsified_weights = []
for w in func_model.get_layer(name='global_max_pooling2d').get_weights():
    sparsified_weights.append(w)

By that, we map over the weights from the old model, except for the classifier layer (Dense). Please note, here we grab the weights up to the pooling layer, which sits right before the classifier.

Now we will create a new model, the same as the old model except for the last layer (Dense(10)), and at the same time add a new Dense layer with 2 units.

predictions    = Dense(2, activation='softmax')(func_model.layers[-2].output)
new_func_model = Model(inputs=func_model.inputs, outputs = predictions) 

And now we can set the weights on the new model as follows:

new_func_model.get_layer(name='global_max_pooling2d').set_weights(sparsified_weights)

You can verify as follows; everything will be the same except the last layer.

func_model.get_weights()      # last layer, Dense (10)
new_func_model.get_weights()  # last layer, Dense (2)

Now you can train the model with the new data set; in our case that was MNIST:

new_func_model.compile(optimizer='adam', loss='categorical_crossentropy')
new_func_model.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 15, 15, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 5, 5, 32)          0         
_________________________________________________________________
global_max_pooling2d (Global (None, 32)                0         
_________________________________________________________________
dense_6 (Dense)              (None, 2)                 66        
=================================================================
Total params: 962
Trainable params: 962
Non-trainable params: 0
'''

# compile 
print('\nFunctional API')
new_func_model.compile(
          loss      = tf.keras.losses.CategoricalCrossentropy(),
          metrics   = tf.keras.metrics.CategoricalAccuracy(),
          optimizer = tf.keras.optimizers.Adam())
# fit 
new_func_model.fit(x_train, y_train, batch_size=128, epochs=1)
WARNING:tensorflow:Model was constructed with shape (None, 32, 32, 3) for input Tensor("input_1:0", shape=(None, 32, 32, 3), dtype=float32), but it was called on an input with incompatible shape (None, 28, 28, 3).
WARNING:tensorflow:Model was constructed with shape (None, 32, 32, 3) for input Tensor("input_1:0", shape=(None, 32, 32, 3), dtype=float32), but it was called on an input with incompatible shape (None, 28, 28, 3).
469/469 [==============================] - 1s 3ms/step - loss: 0.6453 - categorical_accuracy: 0.6447
<tensorflow.python.keras.callbacks.History at 0x7f7af016feb8>

A couple of issues. You are not using the ImageNet weights (I can't imagine why not), and you then set all the layers of the VGG network as untrainable. So you will start out with random weights, and they will stay random. Then you add the Flatten and prediction layers and try to train, so all you will actually be training is a single dense layer. I doubt that will work very well, but I guess it will learn something. I would use the ImageNet weights as a minimum, and I also prefer to train the entire model to get the best results.
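A minimal sketch of that suggestion, reusing the VGG16 constructor from your code (the 'imagenet' weights are the stock Keras ones):

# load ImageNet weights and leave every layer trainable for full fine-tuning
vgg = VGG16(input_shape=(224, 224, 3), weights='imagenet', include_top=False)
for layer in vgg.layers:
    layer.trainable = True

The next issue is this code: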

test_set = train_datagen.flow_from_directory('chest_xray/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')
# where
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

You do not want to perform image augmentation on a test set, which is what you are doing here, so use:

test_set = ImageDataGenerator(rescale = 1./255).flow_from_directory('chest_xray/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical',
                                                 shuffle=False)

Next, you are using model.fit_generator; that will work for now but is being deprecated in future versions of TensorFlow. Use model.fit; it now works with generators (I believe from TensorFlow 2.1 on). What you call the test set is normally called the validation set, so that's OK. You have specified batch_size=32; however, you have this code in model.fit:

steps_per_epoch=len(training_set),
validation_steps=len(test_set)
# what you want is the sample count divided by the batch size; note that
# len() of a flow_from_directory iterator already returns the number of
# batches, so divide the sample counts rather than the iterator lengths:
steps_per_epoch=training_set.samples//32
validation_steps=test_set.samples//32
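Putting that together, a sketch of the equivalent model.fit call (same generators as above; the epoch count is illustrative):

r = model.fit(
    training_set,
    validation_data=test_set,
    epochs=5,
    steps_per_epoch=training_set.samples // 32,
    validation_steps=test_set.samples // 32)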

Once you train your model, it has the weights you desire to use for Dataset2. Just create new generators for Dataset2 and retrain it with model.fit, for example as sketched below.
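A minimal sketch of that retraining step, assuming Dataset2 lives under the hypothetical directories dataset2/train and dataset2/test, reusing the trained model from above (the lower learning rate is illustrative; assumes import tensorflow as tf):

train_gen2 = ImageDataGenerator(rescale = 1./255).flow_from_directory('dataset2/train',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')
val_gen2 = ImageDataGenerator(rescale = 1./255).flow_from_directory('dataset2/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical',
                                                 shuffle=False)

# recompile with a lower learning rate before fine-tuning on Dataset2
model.compile(
    loss='categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    metrics=['accuracy'])

r2 = model.fit(train_gen2, validation_data=val_gen2, epochs=5)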
