I have two separately trained models that independently predict unrelated regression values from the same image. Both models use a pretrained VGG16 base with their own top layers added.
When tested individually, both models perform well. But when I concatenate the two pretrained models into a single branched model, I get different predictions than when I run them independently.
I declare the individual models as follows:
# VGG
vggModel = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape=(310, 765, 3))
vggModel.trainable = True
trainableFlag = False
for layer in vggModel.layers:
    if layer.name == 'block5_conv1':
        trainableFlag = True
    layer.trainable = trainableFlag
# Model A
model_a = tf.keras.Sequential(name='model_a')
model_a.add(vggModel)
model_a.add(tf.keras.layers.Flatten())
model_a.add(tf.keras.layers.Dropout(0.1))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(1, activation='linear'))
model_a.load_weights(model_a_wts)
# Model B
model_b = tf.keras.Sequential(name='model_b')
model_b.add(vggModel)
model_b.add(tf.keras.layers.Flatten())
model_b.add(tf.keras.layers.Dropout(0.1))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(1, activation='linear'))
model_b.load_weights(model_b_wts)
I then concatenate the models like this:
common_input = tf.keras.Input(shape=(310, 765, 3))
a_out = model_a(common_input)
b_out = model_b(common_input)
concatOut = tf.keras.layers.Concatenate()([a_out, b_out])
branched_model = tf.keras.Model(common_input, concatOut, name='Branched')
How is it possible to get different predictions in this scenario?
Based on @hkyi's comment, the answer is:
The two models are not fully independent: they share the same vggModel object (which is partially trainable). You therefore need to clone vggModel before adding it to model_b; otherwise the weights loaded into model_b overwrite the vggModel weights used by model_a.
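A minimal, self-contained sketch (a toy Dense layer standing in for the VGG base) shows the pitfall: two Sequential models that wrap the same sub-model object share its variables, so setting weights on one silently rewrites the other:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the shared VGG base: ONE layer object used by both models.
shared_base = tf.keras.layers.Dense(4, name='shared_base')

model_a = tf.keras.Sequential([tf.keras.Input(shape=(3,)), shared_base])
model_b = tf.keras.Sequential([tf.keras.Input(shape=(3,)), shared_base])

# Both models point at the very same variable objects.
print(model_a.weights[0] is model_b.weights[0])  # True

# Overwriting model_b's weights (as load_weights would) changes model_a too.
new_weights = [w + 1.0 for w in model_b.get_weights()]
model_b.set_weights(new_weights)
print(np.array_equal(model_a.get_weights()[0], new_weights[0]))  # True
```

This is exactly what happens above: model_b.load_weights(model_b_wts) replaces the VGG variables that model_a is still using.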
Use tf.keras.models.clone_model(vggModel) instead:
# VGG
vggModel = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape=(310, 765, 3))
vggModel.trainable = True
trainableFlag = False
for layer in vggModel.layers:
    if layer.name == 'block5_conv1':
        trainableFlag = True
    layer.trainable = trainableFlag
# Model A
model_a = tf.keras.Sequential(name='model_a')
model_a.add(vggModel)
model_a.add(tf.keras.layers.Flatten())
model_a.add(tf.keras.layers.Dropout(0.1))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(1, activation='linear'))
model_a.load_weights(model_a_wts)
# Model B
model_b = tf.keras.Sequential(name='model_b')
model_b.add(tf.keras.models.clone_model(vggModel))  # <-- HERE IS THE CHANGE REQUIRED
model_b.add(tf.keras.layers.Flatten())
model_b.add(tf.keras.layers.Dropout(0.1))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(1, activation='linear'))
model_b.load_weights(model_b_wts)
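As a sanity check (again with a toy model standing in for vggModel): clone_model copies the architecture but creates fresh, independent variables. Note that the clone's weights are re-initialized, which is harmless here because model_b.load_weights restores them immediately afterwards:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for vggModel.
base = tf.keras.Sequential([tf.keras.Input(shape=(3,)),
                            tf.keras.layers.Dense(4, name='d')])
clone = tf.keras.models.clone_model(base)

# The clone has its OWN variables: mutating one model no longer touches the other.
print(base.weights[0] is clone.weights[0])  # False

# clone_model re-initializes weights; copy them over explicitly if you need
# the clone to start from the original's current weights.
clone.set_weights(base.get_weights())
```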