
I froze all Keras layers, yet the model changes when using fit_generator

I am trying to retrain a model using a fine-tuning approach. As a sanity check, I first retrained it with all of its layers frozen. I expected the model not to change, so I was surprised to see this:

Epoch 1/50
16/16 [==============================] - 25s - loss: 4.0006 - acc: 0.5000 - val_loss: 1.3748e-04 - val_acc: 1.0000
Epoch 2/50
16/16 [==============================] - 24s - loss: 3.8861 - acc: 0.5000 - val_loss: 1.7333e-04 - val_acc: 1.0000
Epoch 3/50
16/16 [==============================] - 25s - loss: 3.9560 - acc: 0.5000 - val_loss: 3.0870e-04 - val_acc: 1.0000
Epoch 4/50
16/16 [==============================] - 26s - loss: 3.9730 - acc: 0.5000 - val_loss: 7.5931e-04 - val_acc: 1.0000
Epoch 5/50
16/16 [==============================] - 26s - loss: 3.7195 - acc: 0.5000 - val_loss: 0.0021 - val_acc: 1.0000
Epoch 6/50
16/16 [==============================] - 25s - loss: 3.9514 - acc: 0.5000 - val_loss: 0.0058 - val_acc: 1.0000
Epoch 7/50
16/16 [==============================] - 26s - loss: 3.9459 - acc: 0.5000 - val_loss: 0.0180 - val_acc: 1.0000
Epoch 8/50
16/16 [==============================] - 26s - loss: 3.8744 - acc: 0.5000 - val_loss: 0.0489 - val_acc: 1.0000
Epoch 9/50
16/16 [==============================] - 27s - loss: 3.8914 - acc: 0.5000 - val_loss: 0.1100 - val_acc: 1.0000
Epoch 10/50
16/16 [==============================] - 26s - loss: 4.0585 - acc: 0.5000 - val_loss: 0.2092 - val_acc: 0.7500
Epoch 11/50
16/16 [==============================] - 27s - loss: 4.0232 - acc: 0.5000 - val_loss: 0.3425 - val_acc: 0.7500
Epoch 12/50
16/16 [==============================] - 25s - loss: 3.9073 - acc: 0.5000 - val_loss: 0.4566 - val_acc: 0.7500
Epoch 13/50
16/16 [==============================] - 27s - loss: 4.1036 - acc: 0.5000 - val_loss: 0.5454 - val_acc: 0.7500
Epoch 14/50
16/16 [==============================] - 26s - loss: 3.7854 - acc: 0.5000 - val_loss: 0.6213 - val_acc: 0.7500
Epoch 15/50
16/16 [==============================] - 27s - loss: 3.7907 - acc: 0.5000 - val_loss: 0.7120 - val_acc: 0.7500
Epoch 16/50
16/16 [==============================] - 27s - loss: 4.0540 - acc: 0.5000 - val_loss: 0.7226 - val_acc: 0.7500
Epoch 17/50
16/16 [==============================] - 26s - loss: 3.8669 - acc: 0.5000 - val_loss: 0.8032 - val_acc: 0.7500
Epoch 18/50
16/16 [==============================] - 28s - loss: 3.9834 - acc: 0.5000 - val_loss: 0.9523 - val_acc: 0.7500
Epoch 19/50
16/16 [==============================] - 27s - loss: 3.9495 - acc: 0.5000 - val_loss: 2.5764 - val_acc: 0.6250
Epoch 20/50
16/16 [==============================] - 25s - loss: 3.7534 - acc: 0.5000 - val_loss: 3.0939 - val_acc: 0.6250
Epoch 21/50
16/16 [==============================] - 29s - loss: 3.8447 - acc: 0.5000 - val_loss: 3.0467 - val_acc: 0.6250
Epoch 22/50
16/16 [==============================] - 28s - loss: 4.0613 - acc: 0.5000 - val_loss: 3.2160 - val_acc: 0.6250
Epoch 23/50
16/16 [==============================] - 28s - loss: 4.1428 - acc: 0.5000 - val_loss: 3.8793 - val_acc: 0.6250
Epoch 24/50
16/16 [==============================] - 27s - loss: 3.7868 - acc: 0.5000 - val_loss: 4.1935 - val_acc: 0.6250
Epoch 25/50
16/16 [==============================] - 28s - loss: 3.8437 - acc: 0.5000 - val_loss: 4.5031 - val_acc: 0.6250
Epoch 26/50
16/16 [==============================] - 28s - loss: 3.9798 - acc: 0.5000 - val_loss: 4.5121 - val_acc: 0.6250
Epoch 27/50
16/16 [==============================] - 28s - loss: 3.8727 - acc: 0.5000 - val_loss: 4.5341 - val_acc: 0.6250
Epoch 28/50
16/16 [==============================] - 28s - loss: 3.8343 - acc: 0.5000 - val_loss: 4.5198 - val_acc: 0.6250
Epoch 29/50
16/16 [==============================] - 28s - loss: 4.2144 - acc: 0.5000 - val_loss: 4.5341 - val_acc: 0.6250
Epoch 30/50
16/16 [==============================] - 28s - loss: 3.8348 - acc: 0.5000 - val_loss: 4.5684 - val_acc: 0.6250

This is the code I used:

from keras import backend as K
import inception_v4
import numpy as np
import cv2
import os

import re

from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Activation, Dropout, Flatten, Dense, Input

from keras.models import Model
os.environ['CUDA_VISIBLE_DEVICES'] = ''


    

v4 = inception_v4.create_model(weights='imagenet')


#v4.summary()
my_batch_size=1
train_data_dir ='//shared_directory/projects/try_CDFxx/data/train/'
validation_data_dir ='//shared_directory/projects/try_CDFxx/data/validation/'
top_model_weights_path= 'bottleneck_fc_model.h5'
class_num=2

img_width, img_height = 299, 299
nbr_train_samples=16
nbr_validation_samples=8
num_classes=2
nb_epoch=50

main_input= v4.layers[1].input
main_output=v4.layers[-1].output
flatten_output= v4.layers[-2].output


BN_model = Model(input=[main_input], output=[main_output, flatten_output])





### DEF
train_datagen = ImageDataGenerator(
            rescale=1./255,
            shear_range=0.1,
            zoom_range=0.1,
            rotation_range=10.,
            width_shift_range=0.1,
            height_shift_range=0.1,
            horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)

    
    
train_generator = train_datagen.flow_from_directory(
            train_data_dir,
            target_size = (img_width, img_height),
            batch_size = my_batch_size,
            shuffle = True,
            class_mode = 'categorical')

validation_generator = val_datagen.flow_from_directory(
            validation_data_dir,
            target_size=(img_width, img_height),
            batch_size=my_batch_size,
            shuffle = True,
            class_mode = 'categorical') # sparse


###

def save_BN(BN_model):   # but we will need to get the get_processed_image into it!!!!
#   
    datagen = ImageDataGenerator(rescale=1./255) # here!
#   
    generator = datagen.flow_from_directory(
            train_data_dir,
            target_size=(img_width, img_height),
            batch_size=my_batch_size,
            class_mode='categorical',
            shuffle=False)
    nb_train_samples = generator.classes.size       
    bottleneck_features_train = BN_model.predict_generator(generator, nb_train_samples)
#
    np.save(open('bottleneck_flat_features_train.npy', 'wb'), bottleneck_features_train[1])

    np.save(open('bottleneck_train_labels.npy', 'wb'), generator.classes)
    #   generator is probably a tuple - and the second thing in it is a label! OKAY, its not :(
    generator = datagen.flow_from_directory(
            validation_data_dir,
            target_size=(img_width, img_height),
            batch_size=my_batch_size,
            class_mode='categorical',
            shuffle=False)
            
    nb_validation_samples = generator.classes.size
    bottleneck_features_validation = BN_model.predict_generator(generator, nb_validation_samples)
    #bottleneck_features_validation = model.train_generator(generator, nb_validation_samples)
#
    np.save(open('bottleneck_flat_features_validation.npy', 'wb'), bottleneck_features_validation[1])

    np.save(open('bottleneck_validation_labels.npy', 'wb'), generator.classes)
    
    

def train_top_model ():
    # Open in binary mode: .npy files are binary, and text mode fails on Python 3.
    train_data = np.load(open('bottleneck_flat_features_train.npy', 'rb'))
    train_labels = np.load(open('bottleneck_train_labels.npy', 'rb'))
#
    validation_data = np.load(open('bottleneck_flat_features_validation.npy', 'rb'))
    validation_labels = np.load(open('bottleneck_validation_labels.npy', 'rb'))
    #
    top_m  = Sequential()
    top_m.add(Dense(class_num,input_shape=train_data.shape[1:], activation='softmax', name='top_dense1'))
    top_m.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
#
    top_m.fit(train_data, train_labels,
    nb_epoch=nb_epoch, batch_size=my_batch_size,
    validation_data=(validation_data, validation_labels))
#
#
    #top_m.save_weights (top_model_weights_path)
#   validation_data[0]
#   train_data[0]
    Dense_layer=top_m.layers[-1]
    top_layer_weights=Dense_layer.get_weights()
    np.save(open('retrained_top_layer_weight.npy', 'wb'), top_layer_weights)


def fine_tune_model (): 

    predictions = Flatten()(v4.layers[-3].output)
    predictions = Dense(output_dim=num_classes, activation='softmax', name="newDense")(predictions)
    main_input= v4.layers[1].input
    main_output=predictions
    FT_model = Model(input=[main_input], output=[main_output])

    top_layer_weights = np.load(open('retrained_top_layer_weight.npy', 'rb'))
    Dense_layer=FT_model.layers[-1]
    Dense_layer.set_weights(top_layer_weights)
    
    for layer in FT_model.layers:
        layer.trainable = False 
#   FT_model.layers[-1].trainable=True

    FT_model.compile(optimizer=optimizers.SGD(lr=1e-4, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])

    
    FT_model.fit_generator(
            train_generator,
            samples_per_epoch = nbr_train_samples,
            nb_epoch = nb_epoch,
            validation_data = validation_generator,
            nb_val_samples = nbr_validation_samples)    

# Run the pipeline: extract bottleneck features, train the top layer, then fine-tune.


save_BN(BN_model)
train_top_model()

fine_tune_model()

Thanks.

P.S. I'm using Keras 1.

You are using dropout, so the metrics can differ from run to run because different units are switched off each time.
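To illustrate the point about dropout, here is a minimal NumPy sketch of inverted dropout. It is an illustration only, not Keras's own implementation; the `dropout` helper, the `rate` of 0.5, and the all-ones `features` array are assumptions made for the example. It shows that two training-mode passes over the same input differ, while inference-mode passes are identical.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout (illustrative): zero a random subset of units while training."""
    if not training:
        return x  # inference mode: identity, fully deterministic
    keep = rng.random_sample(x.shape) >= rate  # True for the units that survive
    return x * keep / (1.0 - rate)             # rescale so the expected value is unchanged

rng = np.random.RandomState(0)
features = np.ones(1000)  # hypothetical activations feeding the dropout layer

train_a = dropout(features, rate=0.5, training=True, rng=rng)
train_b = dropout(features, rate=0.5, training=True, rng=rng)
eval_a = dropout(features, rate=0.5, training=False, rng=rng)
eval_b = dropout(features, rate=0.5, training=False, rng=rng)

print("training passes identical:", np.array_equal(train_a, train_b))
print("inference passes identical:", np.array_equal(eval_a, eval_b))
```

Note that this only accounts for noise in the training metrics: validation runs in inference mode, where dropout is inactive, so it would not by itself explain a drifting val_loss.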

The changes in the training metrics are normal because you are using image data augmentation, so the data the model sees is different in every epoch. To freeze all the layers, try setting the trainable attribute of the model itself to False:

FT_model.trainable = False
print('This is the number of trainable weights after freezing the conv base:', len(FT_model.trainable_weights))
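Counting trainable weights shows what the optimizer is allowed to update, but it does not show whether any stored values actually moved. One way to check that directly is to snapshot the weights before and after training and compare them. The sketch below assumes the snapshots are lists of NumPy arrays, as returned by `get_weights()`; the helper name `changed_weight_indices` and the stand-in arrays are hypothetical.

```python
import numpy as np

def changed_weight_indices(before, after, atol=0.0):
    """Return the indices of weight arrays that differ between two snapshots."""
    if len(before) != len(after):
        raise ValueError("snapshots hold a different number of arrays")
    return [i for i, (b, a) in enumerate(zip(before, after))
            if b.shape != a.shape or not np.allclose(b, a, rtol=0.0, atol=atol)]

# Stand-ins for two get_weights() snapshots (hypothetical values).
before = [np.zeros((3, 2)), np.ones(2)]
after = [np.zeros((3, 2)), np.ones(2) + 0.01]

print(changed_weight_indices(before, after))  # prints [1]
```

With the code from the question, this would mean calling `before = FT_model.get_weights()` just before `fit_generator` and `after = FT_model.get_weights()` just after it. An empty result means no stored value changed; since `get_weights()` also covers non-trainable weights, a non-empty result points at the arrays that moved even though the layers were marked as frozen.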
