Loading a trained Keras model and continuing training

Question

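I was wondering whether it is possible to save a partly trained Keras model and continue training after loading it again.

The reason is that I will have more training data in the future, and I do not want to retrain the whole model from scratch.

The functions I am using are: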

#Partly train model
model.fit(first_training, first_classes, batch_size=32, nb_epoch=20)

#Save partly trained model
model.save('partly_trained.h5')

#Load partly trained model
from keras.models import load_model
model = load_model('partly_trained.h5')

#Continue training
model.fit(second_training, second_classes, batch_size=32, nb_epoch=20)

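Edit 1: added a fully working example

With the first dataset, after 10 epochs the loss of the last epoch is 0.0748 and the accuracy is 0.9863.

After saving, deleting, and reloading the model, the loss and accuracy of the model trained on the second dataset are 0.1711 and 0.9504 respectively.

Is this caused by the new training data or by a completely retrained model?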

"""
Model by: http://machinelearningmastery.com/
"""
# load (downloaded if needed) the MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.models import load_model
numpy.random.seed(7)

def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

if __name__ == '__main__':
    # load data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # flatten 28*28 images to a 784 vector for each image
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
    # normalize inputs from 0-255 to 0-1
    X_train = X_train / 255
    X_test = X_test / 255
    # one hot encode outputs
    y_train = np_utils.to_categorical(y_train)
    y_test = np_utils.to_categorical(y_test)
    num_classes = y_test.shape[1]

    # build the model
    model = baseline_model()

    #Partly train model
    dataset1_x = X_train[:3000]
    dataset1_y = y_train[:3000]
    model.fit(dataset1_x, dataset1_y, nb_epoch=10, batch_size=200, verbose=2)

    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

    #Save partly trained model
    model.save('partly_trained.h5')
    del model

    #Reload model
    model = load_model('partly_trained.h5')

    #Continue training
    dataset2_x = X_train[3000:]
    dataset2_y = y_train[3000:]
    model.fit(dataset2_x, dataset2_y, nb_epoch=10, batch_size=200, verbose=2)
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

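Edit 2: tensorflow.keras remarks

For tensorflow.keras, change the argument nb_epoch to epochs in the calls to model.fit, and call to_categorical directly instead of np_utils.to_categorical. The imports and the baseline_model function are: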

import numpy
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import load_model


numpy.random.seed(7)

def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

Answer 1

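Actually, model.save saves all the information needed to restart training in your case. The only thing that could be spoiled by reloading the model is your optimizer state. To check this, try saving and reloading the model, and then training it on the same training data.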
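
To illustrate why the optimizer state matters, here is a minimal pure-Python sketch of Adam on a single parameter. It is not Keras code: the names adam_step, grad, and train, the toy loss (w - 3)^2, and the hyperparameters are assumptions chosen for illustration. It compares resuming with the saved moment estimates against resuming with the weight only.

```python
import math

def adam_step(w, state, g, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar weight; returns (new_w, new_state)."""
    t = state["t"] + 1
    m = b1 * state["m"] + (1 - b1) * g        # first-moment estimate
    v = b2 * state["v"] + (1 - b2) * g * g    # second-moment estimate
    m_hat = m / (1 - b1 ** t)                 # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), {"t": t, "m": m, "v": v}

def grad(w):
    """Gradient of the toy loss (w - 3)^2 (hypothetical stand-in for a model)."""
    return 2.0 * (w - 3.0)

def train(w, state, steps):
    for _ in range(steps):
        w, state = adam_step(w, state, grad(w))
    return w, state

fresh = {"t": 0, "m": 0.0, "v": 0.0}

# Reference run: 25 uninterrupted steps.
continuous, _ = train(0.0, dict(fresh), 25)

# "Save" after 20 steps, then resume for 5 more steps in two ways.
w_saved, state_saved = train(0.0, dict(fresh), 20)
resumed_with_state, _ = train(w_saved, dict(state_saved), 5)
resumed_without_state, _ = train(w_saved, dict(fresh), 5)

print("uninterrupted:          ", continuous)
print("resumed with state:     ", resumed_with_state)
print("resumed, state was lost:", resumed_without_state)
```

In this toy setting, resuming with the saved state reproduces the uninterrupted run, while resuming with a fresh optimizer takes differently sized steps. The same comparison can be made on a real model by checking whether the first epochs after reloading behave like a continuation of the earlier run.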

Answer 2

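Most of the answers above cover the important points. If you are using a recent TensorFlow (TF 2.1 or later), the following example should help. The model part of the code is taken from the TensorFlow website.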

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),  
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])

  model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
  return model

# Create a basic model instance
model=create_model()
model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

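Save the model in *.tf format. In my experience, if you have defined a custom loss, the *.h5 format does not save the optimizer state, so it will not serve your purpose if you want to resume training from where you left off.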

# saving the model in tensorflow format
model.save('./MyModel_tf',save_format='tf')


# loading the saved model
loaded_model = tf.keras.models.load_model('./MyModel_tf')

# retraining the model
loaded_model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

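With this approach, training restarts from where we left off before the model was saved. As others have mentioned, if you want to save the weights of the best model, or save the weights at every epoch, you need to use the Keras callback ModelCheckpoint with options such as save_weights_only=True, save_freq='epoch', and save_best_only.

For more details, see here, and here for another example.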

Answer 3

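The problem might be that you are using a different optimizer, or different arguments for your optimizer. I just had the same issue with a custom pretrained model, using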

reduce_lr = ReduceLROnPlateau(monitor='loss', factor=lr_reduction_factor,
                              patience=patience, min_lr=min_lr, verbose=1)

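for the pretrained model, where the original learning rate starts at 0.0003 and is reduced during pre-training to the minimum learning rate, min_lr, which is 0.000003.

I simply copied that line into the script that uses the pretrained model and got really bad accuracy, until I noticed that the last learning rate of the pretrained model was the minimum learning rate, 0.000003. If I start with that learning rate, I get exactly the same accuracy as the output of the pretrained model. This makes sense: starting with a learning rate 100 times larger than the last one used in pre-training causes gradient descent to overshoot massively, which heavily degrades accuracy.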
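
The overshoot can be reproduced with a toy example. The sketch below is plain gradient descent on a one-dimensional quadratic, not the answerer's model: the curvature value, the starting weight, and the function names are assumptions, chosen so that the smaller rate is stable and the rate 100 times larger is not. Only the two learning rates, 0.0003 and 0.000003, come from the answer.

```python
def loss(w, curvature=10000.0):
    """Toy loss 0.5 * curvature * w**2 with its minimum at w = 0."""
    return 0.5 * curvature * w ** 2

def gradient_descent(w, lr, steps, curvature=10000.0):
    """Plain gradient descent on the toy loss above."""
    for _ in range(steps):
        w = w - lr * curvature * w   # gradient of the loss is curvature * w
    return w

w_pretrained = 0.01      # hypothetical weight already close to the minimum
last_lr = 0.000003       # learning rate at the end of pre-training
initial_lr = 0.0003      # learning rate at the start of pre-training

w_resumed_small = gradient_descent(w_pretrained, last_lr, steps=10)
w_resumed_large = gradient_descent(w_pretrained, initial_lr, steps=10)

print("loss before resuming:      ", loss(w_pretrained))
print("loss, resumed at last lr:  ", loss(w_resumed_small))
print("loss, resumed at 100x lr:  ", loss(w_resumed_large))
```

With these assumed values, resuming at the last learning rate keeps lowering the loss, while resuming at the initial rate moves the weight away from the minimum at every step. Real networks are not quadratic, so the size of the effect will differ, but the mechanism is the same.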

Answer 4

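Note that Keras sometimes has issues with loaded models, as described here. This might explain cases in which you do not start from the same training accuracy.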

Answer 5

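All of the above helps. In addition, you must resume from the same learning rate that was in effect when the model and weights were saved. Set it directly on the optimizer.

Note that improvement from there is not guaranteed, because the model may already have reached a local minimum, which may even be the global one. There is no point in resuming a model in order to search for another local minimum, unless you intend to increase the learning rate in a controlled fashion and nudge the model into a possibly better minimum not far away.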

Answer 6

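You might also be running into concept drift; see "Should you retrain a model when new observations are available". There is also the concept of catastrophic forgetting, which a number of academic papers discuss. Here is one using MNIST: "Empirical investigation of catastrophic forgetting".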
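
As a rough illustration of forgetting, the sketch below fits a one-parameter linear model y = w * x on one dataset and then continues training on a second dataset only. Everything in it is an assumption made for illustration: the two datasets deliberately follow conflicting rules (y = 2x and y = -x), which is far more extreme than two halves of MNIST, and the helper names mse and fit are hypothetical.

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def fit(w, data, epochs, lr=0.05):
    """Full-batch gradient descent on the mean squared error."""
    for _ in range(epochs):
        g = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * g
    return w

xs = [0.5, 1.0, 1.5, 2.0]
dataset_a = [(x, 2.0 * x) for x in xs]    # first batch of data: y = 2x
dataset_b = [(x, -1.0 * x) for x in xs]   # later data follows a different rule

w = fit(0.0, dataset_a, epochs=100)       # initial training
loss_a_before = mse(w, dataset_a)

w = fit(w, dataset_b, epochs=100)         # continue training on new data only
loss_a_after = mse(w, dataset_a)
loss_b_after = mse(w, dataset_b)

print("loss on A after training on A:", loss_a_before)
print("loss on A after training on B:", loss_a_after)
print("loss on B after training on B:", loss_b_after)
```

After the second phase the model fits the new data and no longer fits the old data. If the new data comes from the same distribution as the old, as in the question's MNIST split, the effect should be much milder, which is why it helps to evaluate on a fixed test set before and after continuing training.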

Answer 7

If you are using TF2, use the new saved_model method (pb format). More information is available here and here.

model.fit(x=X_train, y=y_train, epochs=10,callbacks=[model_callback])#your first training
tf.saved_model.save(model, save_to_dir_path) #save the model
del model #to delete the model
model =  tf.keras.models.load_model(save_to_dir_path)
model.fit(x=X_train, y=y_train, epochs=10,callbacks=[model_callback])#your second training

Answer 8

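Training a model starting from a saved model is perfectly fine. I trained the saved model on the same data and found that it gave good accuracy. Moreover, each epoch took very little time.

Here is the code to have a look at: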

from keras.models import load_model
model = load_model('/content/drive/MyDrive/CustomResNet/saved_models/model_1.h5')
history=model.fit(train_gen,validation_data=valid_gen,epochs=5)

8 answers

Answer 1: score 45, accepted, 2017-03-08 11:45:28
Answer 2: score 28, 2020-04-06 05:37:38
Answer 3: score 10, 2017-12-28 15:53:37
Answer 4: score 3, 2017-07-26 08:42:45
Answer 5: score 1, 2018-05-29 18:55:32
Answer 6: score 1, 2018-12-10 00:03:06
Answer 7: score 1, 2021-01-30 19:49:05
Answer 8: score -1, 2022-06-13 03:47:29
