How to do transfer learning without ImageNet weights?

Question

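Here is a description of my project:

Dataset1: a larger dataset containing images from two classes (binary).

Dataset2: contains 2 classes that look very similar to those in Dataset1. I want to build a model that learns from Dataset1 via transfer learning and then applies those weights to Dataset2 with a lower learning rate.

So I want to train the whole VGG16 on dataset1, then use transfer learning to fine-tune the last layer on dataset2. I don't want to use the pretrained ImageNet weights. This is the code I'm using, and I have saved the weights from it: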


from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
import numpy as np
from glob import glob
import matplotlib.pyplot as plt

vgg = VGG16(input_shape=(244, 244, 3), weights=None, include_top=False)

# don't train existing weights
for layer in vgg.layers:
    layer.trainable = False
    
x = Flatten()(vgg.output)   

import tensorflow.keras
prediction = tensorflow.keras.layers.Dense(2, activation='softmax')(x)

model = Model(inputs=vgg.input, outputs=prediction)

model.compile(
  loss='categorical_crossentropy',
  optimizer='adam',
  metrics=['accuracy']
)

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('chest_xray/train',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

test_set = train_datagen.flow_from_directory('chest_xray/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

# fit the model
r = model.fit_generator(
  training_set,
  validation_data=test_set,
  epochs=5,
  steps_per_epoch=len(training_set),
  validation_steps=len(test_set)
)

model.save_weights('first_try.h5')

Answer 1

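Update

Based on your comments, it seems the number of classes will not differ in Dataset2. You also don't want to use the ImageNet weights. In that case you don't need to map or store weights (as described below). Just load the model with its weights and train on Dataset2. Freeze all the layers trained on Dataset1 and train only the last layer on Dataset2; it is really straightforward.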
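The two stages described above can be sketched as follows. This is a minimal sketch, not the asker's actual pipeline: the random arrays produced by the hypothetical `fake_dataset` helper stand in for Dataset1 and Dataset2, the 32x32 input size and the `1e-5` learning rate are assumptions chosen so the example runs quickly, and the file name `dataset1.weights.h5` is made up for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model(input_shape=(32, 32, 3), num_classes=2):
    """VGG16 with random initial weights (weights=None) plus a softmax head."""
    base = tf.keras.applications.VGG16(
        input_shape=input_shape, weights=None, include_top=False
    )
    x = tf.keras.layers.Flatten()(base.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(base.input, outputs)


def fake_dataset(n, seed):
    """Random stand-in for a real image dataset (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    images = rng.random((n, 32, 32, 3), dtype=np.float32)
    labels = tf.keras.utils.to_categorical(
        rng.integers(0, 2, size=n), num_classes=2
    )
    return images, labels


weights_path = os.path.join(tempfile.mkdtemp(), "dataset1.weights.h5")

# Stage 1: train every layer on Dataset1, then save the weights.
x1, y1 = fake_dataset(8, seed=0)
model = build_model()
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x1, y1, batch_size=4, epochs=1, verbose=0)
model.save_weights(weights_path)

# Stage 2: rebuild the same architecture, load the Dataset1 weights,
# freeze everything except the last layer, and train with a low learning rate.
x2, y2 = fake_dataset(8, seed=1)
new_model = build_model()
new_model.load_weights(weights_path)
for layer in new_model.layers[:-1]:
    layer.trainable = False
new_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
new_model.fit(x2, y2, batch_size=4, epochs=1, verbose=0)
```

Note that `trainable` is changed before `compile`, so the frozen layers are excluded from the optimizer's updates.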

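I'm keeping the rest of my answer below for future reference, even though you don't need all of it.

Here is a small demo of what you might need. I hope it gives you some insight. We will train on the CIFAR-10 dataset, which has 10 classes, and then try to use the result for transfer learning on a different dataset that may have a different input size and a different number of classes.

Prepare CIFAR (10 classes)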

import numpy as np
import tensorflow as tf 
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# train set / data 
x_train = x_train.astype('float32') / 255

# validation set / data 
x_test = x_test.astype('float32') / 255

# train set / target 
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)
# validation set / target 
y_test = tf.keras.utils.to_categorical(y_test, num_classes=10)

print(x_train.shape, y_train.shape) 
print(x_test.shape, y_test.shape)  
'''
(50000, 32, 32, 3) (50000, 10)
(10000, 32, 32, 3) (10000, 10)
'''

Model

# declare input shape 
input = tf.keras.Input(shape=(32,32,3))
# Block 1
x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(input)
x = tf.keras.layers.MaxPooling2D(3)(x)

# Now that we apply global max pooling.
gap = tf.keras.layers.GlobalMaxPooling2D()(x)

# Finally, we add a classification layer.
output = tf.keras.layers.Dense(10, activation='softmax')(gap)

# bind all
func_model = tf.keras.Model(input, output)

'''
Model: "functional_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 15, 15, 32)        896       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 32)          0         
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                330       
=================================================================
Total params: 1,226
Trainable params: 1,226
Non-trainable params: 0
'''

Train the model to get some weight matrices, as follows:

# compile 
print('\nFunctional API')
func_model.compile(
          loss      = tf.keras.losses.CategoricalCrossentropy(),
          metrics   = tf.keras.metrics.CategoricalAccuracy(),
          optimizer = tf.keras.optimizers.Adam())
# fit 
func_model.fit(x_train, y_train, batch_size=128, epochs=1)

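Transfer learning

Let's use it for MNIST. It also has 10 classes, but since we want a different number of classes, we will create even and odd classes (2 classes) from it. This is how we prepare the dataset: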

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# train set / data 
x_train = np.expand_dims(x_train, axis=-1)
x_train = np.repeat(x_train, 3, axis=-1)
x_train = x_train.astype('float32') / 255
# train set / target 
y_train = tf.keras.utils.to_categorical((y_train % 2 == 0).astype(int), 
                                        num_classes=2)

# validation set / data 
x_test = np.expand_dims(x_test, axis=-1)
x_test = np.repeat(x_test, 3, axis=-1)
x_test = x_test.astype('float32') / 255
# validation set / target 

y_test = tf.keras.utils.to_categorical((y_test % 2 == 0).astype(int), 
                                       num_classes=2)

print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)  
'''
(60000, 28, 28, 3) (60000, 2)
(10000, 28, 28, 3) (10000, 2)
'''

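If you are familiar with using pretrained weights in Keras models, you have probably used include_top. Setting it to False lets you load a weight file without the top (classifier) part of the pretrained model. Here we need to do that (somewhat) manually. We need to grab the weight matrices up to, but not including, the last activation layer (in our case Dense(10, softmax)), put them into a new instance of the base model, and then add a new classifier layer (in our example Dense(2, softmax)).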

for i, layer in enumerate(func_model.layers):
    print(i,'\t',layer.trainable,'\t  :',layer.name)

'''
  Train_Bool  : Layer Names
0    True     : input_1
1    True     : conv2d
2    True     : max_pooling2d
3    True     : global_max_pooling2d # < we go till here to grab the weight and biases
4    True     : dense  # 10 classes (from previous model)
'''

Get the weights

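sparsified_weights = []
# Note: a pooling layer has no parameters, so get_weights() returns an empty
# list here; the learned parameters of this model live in the conv2d layer.
for w in func_model.get_layer(name='global_max_pooling2d').get_weights():
    sparsified_weights.append(w)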

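This way we map the weights from the old model, except for the classifier layer (Dense). Note that here we grab the weights up to the GAP layer, which sits right before the classifier.

Now we create a new model that is identical to the old one except for the last layer (the 10-unit Dense), and add a new Dense layer with 2 units.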

predictions    = Dense(2, activation='softmax')(func_model.layers[-2].output)
new_func_model = Model(inputs=func_model.inputs, outputs = predictions)

Now we can set the weights on the new model as follows:

new_func_model.get_layer(name='global_max_pooling2d').set_weights(sparsified_weights)

You can verify this as follows; everything will be the same except the last layer.

func_model.get_weights()      # last layer, Dense (10)
new_func_model.get_weights()  # last layer, Dense (2)
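A more explicit check than eyeballing the two weight lists is to compare the models layer by layer. The sketch below rebuilds the small demo network under the assumed names `old_model` and `new_model` (they are not the names used in the answer) and leaves the weights untrained, since training is not needed to show the point. Because the new model is built on top of the old model's tensors, every layer before the classifier is the same Python object in both models, so those weights are shared rather than copied.

```python
import numpy as np
import tensorflow as tf

# Small network with the same layout as the demo model (untrained here).
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(inputs)
x = tf.keras.layers.MaxPooling2D(3)(x)
x = tf.keras.layers.GlobalMaxPooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
old_model = tf.keras.Model(inputs, outputs)

# New 2-class head attached to the output of the layer before the classifier.
new_head = tf.keras.layers.Dense(2, activation="softmax")(
    old_model.layers[-2].output
)
new_model = tf.keras.Model(old_model.inputs, new_head)

# Compare every layer except the classifier.
for old_layer, new_layer in zip(old_model.layers[:-1], new_model.layers[:-1]):
    same_values = all(
        np.array_equal(a, b)
        for a, b in zip(old_layer.get_weights(), new_layer.get_weights())
    )
    print(old_layer.name, "shared object:", old_layer is new_layer,
          "equal weights:", same_values)

# Only the classifier kernels differ in shape.
print(old_model.layers[-1].get_weights()[0].shape)  # (32, 10)
print(new_model.layers[-1].get_weights()[0].shape)  # (32, 2)
```

One consequence of the sharing is that training `new_model` without freezing also updates the convolution weights seen by `old_model`.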

Now you can train the model on the new dataset, in our case MNIST:

new_func_model.compile(optimizer='adam', loss='categorical_crossentropy')
new_func_model.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 15, 15, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 5, 5, 32)          0         
_________________________________________________________________
global_max_pooling2d (Global (None, 32)                0         
_________________________________________________________________
dense_6 (Dense)              (None, 2)                 66        
=================================================================
Total params: 962
Trainable params: 962
Non-trainable params: 0
'''

# compile 
print('\nFunctional API')
new_func_model.compile(
          loss      = tf.keras.losses.CategoricalCrossentropy(),
          metrics   = tf.keras.metrics.CategoricalAccuracy(),
          optimizer = tf.keras.optimizers.Adam())
# fit 
new_func_model.fit(x_train, y_train, batch_size=128, epochs=1)

WARNING:tensorflow:Model was constructed with shape (None, 32, 32, 3) for input Tensor("input_1:0", shape=(None, 32, 32, 3), dtype=float32), but it was called on an input with incompatible shape (None, 28, 28, 3).
WARNING:tensorflow:Model was constructed with shape (None, 32, 32, 3) for input Tensor("input_1:0", shape=(None, 32, 32, 3), dtype=float32), but it was called on an input with incompatible shape (None, 28, 28, 3).
469/469 [==============================] - 1s 3ms/step - loss: 0.6453 - categorical_accuracy: 0.6447
<tensorflow.python.keras.callbacks.History at 0x7f7af016feb8>
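The warnings appear because the model was declared with 32x32 inputs while MNIST images are 28x28. Training still runs here because the global pooling layer removes the spatial dimensions before the Dense layer. One possible way to avoid the mismatch, assuming zero padding is acceptable for your data, is to pad the images to 32x32 first. In this sketch the zero-filled array `x` is only a stand-in for the real `x_train`.

```python
import numpy as np

# Stand-in batch with the MNIST-like shape (N, 28, 28, 3); use the real x_train instead.
x = np.zeros((4, 28, 28, 3), dtype="float32")

# Pad 2 pixels of zeros on each side of height and width: 28 + 2 + 2 = 32.
# The batch and channel axes are left untouched.
x_padded = np.pad(x, pad_width=((0, 0), (2, 2), (2, 2), (0, 0)), mode="constant")

print(x_padded.shape)  # (4, 32, 32, 3)
```

Resizing the images would be another option; padding keeps the original pixels unchanged.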

Answer 2

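There are several issues. You are not using the ImageNet weights (I can't imagine why not), and then you set all the layers of the VGG network to non-trainable. So you start with random weights and they stay random. Then you add a Flatten layer and a prediction layer and try to train. All you will be training is a single dense layer. I doubt this will work well, though I suppose it will learn something. I would at least use the ImageNet weights, and I also prefer to train the entire model for best results. The next issue is the code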

test_set = train_datagen.flow_from_directory('chest_xray/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')
#where
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

You don't want to apply image augmentation to the test set, which is what this code does, so use

test_set = ImageDataGenerator(rescale = 1./255).flow_from_directory('chest_xray/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical',
                                                 shuffle=False)

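Next, you are using model.fit_generator, which works for now but is deprecated and will be removed in a future version of TensorFlow. Use model.fit instead, which accepts generators (since TensorFlow 2.1, I believe). What you call the test set is usually called the validation set, but that's fine. You have specified batch_size=32, and in model.fit you have the code

steps_per_epoch=len(training_set),
validation_steps=len(test_set)
# len() of a Keras directory iterator already counts batches, not images,
# so the values above cover one full pass over the data.
# If you compute the steps from the number of images instead, divide by the batch size:
steps_per_epoch=training_set.samples // 32,
validation_steps=test_set.samples // 32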
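The relation between image count, batch size and steps per epoch is plain arithmetic and can be checked without Keras. As far as I know, `len()` of a Keras directory iterator is the number of batches, that is, the image count divided by the batch size and rounded up. The helper name `steps_per_epoch` and the figure of 1000 images below are assumptions for illustration only.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches needed to see num_samples images once.

    With drop_last=True the final partial batch is skipped (floor division);
    otherwise it is counted (ceiling division).
    """
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


print(steps_per_epoch(1000, 32))                  # 32
print(steps_per_epoch(1000, 32, drop_last=True))  # 31
```

Dividing a value that is already a batch count by the batch size again would shrink each epoch to a small fraction of the data.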

Once you have trained your model, it has the weights you want to use for dataset2. Just create new generators for dataset2 and retrain the model with model.fit.
