Setting a keras layer as not trainable after a compile changes the number of total parameters in the summary

Question

I would like to know how to interpret the following model summary output from the Keras library. The results below are from Keras version 2.3.1.

In Keras, we can set a layer's trainable attribute to False so that its weights do not change during training.

from keras.models import Sequential
from keras.layers import Dense
model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.summary()
print("***")

model.layers[0].trainable = False
model.summary()

Model: "sequential_36"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_101 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_102 (Dense)            (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential_36"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_101 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_102 (Dense)            (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 6
Non-trainable params: 20

The result above is intuitive: since I set the first layer as not trainable, there are fewer trainable parameters.
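For reference, the per-layer counts in the summary follow from the usual fully connected layer arithmetic (kernel weights plus one bias per unit). The helper below is only an illustrative sketch; `dense_param_count` is a made-up name, not a Keras function.

```python
def dense_param_count(input_dim, units, use_bias=True):
    """Number of weights in a fully connected layer: kernel plus optional bias."""
    kernel = input_dim * units          # one weight per (input, unit) pair
    bias = units if use_bias else 0     # one bias per unit
    return kernel + bias


first = dense_param_count(input_dim=3, units=5)   # Dense(5, input_dim=3)
second = dense_param_count(input_dim=5, units=1)  # Dense(1), fed by the 5 outputs above

print(first, second, first + second)  # 20 6 26
```

Freezing the first layer moves its 20 parameters from the trainable to the non-trainable count, which leaves 6 trainable and 26 in total.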

If I compile the model before changing the attribute (this is not standard but may happen in some applications), I get the following.

model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.compile(loss="mse", optimizer="adam")

model.summary()
print("***")
model.layers[0].trainable = False
model.summary()

Model: "sequential_38"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_105 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_106 (Dense)            (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential_38"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_105 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_106 (Dense)            (None, 1)                 6         
=================================================================
Total params: 46
Trainable params: 26
Non-trainable params: 20

This says there are more parameters than before. Can someone clarify how these numbers should be interpreted?

[EDIT]

From the answers received, this seems to be a bug-feature whose behavior depends on the package version. Here is another example, obtained with the TensorFlow Keras API. Unlike in @lukasz-tracewski's answer, I still get the same parameter counts as above (46 in total), only with a different warning message. Perhaps the versions are slightly different?

import tensorflow as tf
print("tensorflow version is", tf.__version__)
print("keras version is", tf.keras.__version__)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.compile(loss="mse", optimizer="adam")

model.summary()
print("***")
model.layers[0].trainable = False
model.summary()

tensorflow version is 2.1.0
keras version is 2.2.4-tf
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 5)                 20        
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 5)                 20        
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 6         
=================================================================
WARNING:tensorflow:Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
Total params: 46
Trainable params: 26
Non-trainable params: 20

Answer 1

It's a bug-feature in Keras; here's the issue. As you can see from the comment there, it was resolved by just adding a warning that there's an inconsistency:

UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
  'Discrepancy between trainable weights and collected trainable'
Total params: 46
Trainable params: 26
Non-trainable params: 20

You might miss this warning if you're using some Jupyter-like solutions (that sometimes eat warning messages).
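If you suspect your environment is swallowing the warning, one option is to capture warnings explicitly with Python's standard `warnings` module. This is only a sketch: `call_and_collect_warnings` and `fake_summary` are hypothetical names, and `fake_summary` merely stands in for `model.summary` so the snippet runs without Keras. It applies to the `UserWarning` shown above; a message prefixed with `WARNING:tensorflow:` appears to come from TensorFlow's logger instead and would not be caught this way.

```python
import warnings


def call_and_collect_warnings(fn):
    """Call fn() and return the text of every warning it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not hide repeated warnings
        fn()
    return [str(w.message) for w in caught]


def fake_summary():
    # Hypothetical stand-in for model.summary() in standalone Keras.
    warnings.warn(
        "Discrepancy between trainable weights and collected trainable weights",
        UserWarning,
    )


messages = call_and_collect_warnings(fake_summary)
for message in messages:
    print("caught:", message)
```

With a real model you would pass `model.summary` instead of `fake_summary`.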

In short, it's caused by the fact that the summary method takes the trainable count from the weights collected when the model was compiled, and then separately counts the weights that are currently marked as non-trainable. That's why you get 26 (all trainable params in the compiled model) + 20 (from the non-trainable attribute checked later) = 46.
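The double counting can be reproduced with a small toy model in plain Python. This is a simplified sketch of the mechanism described above, not the actual Keras source; `ToyLayer`, `ToyModel` and `summary_counts` are invented for illustration.

```python
class ToyLayer:
    def __init__(self, n_params):
        self.n_params = n_params
        self.trainable = True


class ToyModel:
    def __init__(self, layers):
        self.layers = layers
        self._collected_trainable = None  # filled in by compile()

    def compile(self):
        # Snapshot the sizes of the layers that are trainable right now.
        self._collected_trainable = [
            layer.n_params for layer in self.layers if layer.trainable
        ]

    def summary_counts(self):
        # Trainable count: compile-time snapshot if there is one, else live flags.
        if self._collected_trainable is not None:
            trainable = sum(self._collected_trainable)
        else:
            trainable = sum(l.n_params for l in self.layers if l.trainable)
        # Non-trainable count: always read from the live flags.
        non_trainable = sum(l.n_params for l in self.layers if not l.trainable)
        return trainable + non_trainable, trainable, non_trainable


model = ToyModel([ToyLayer(20), ToyLayer(6)])
model.compile()
model.layers[0].trainable = False
stale = model.summary_counts()   # snapshot is out of date

model.compile()                  # refresh the snapshot
fresh = model.summary_counts()

print(stale)  # (46, 26, 20)
print(fresh)  # (26, 6, 20)
```

In this toy version, compiling again after changing the flag brings the numbers back in line, which is also what the warning message suggests doing.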

The TensorFlow Keras API does not have this bug-feature, at least in the version I tested (see below).

[EDIT]

Since there could be some confusion between TensorFlow with the Keras API and Keras with the TensorFlow backend, below you will find code for the former. It's almost identical to what the OP provided; only the imports differ.

from tensorflow.keras.layers import Dense
from tensorflow.keras import Sequential

model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.compile(loss="mse", optimizer="adam")

model.summary()
print("***")
model.layers[0].trainable = False
model.summary()

Output:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_2 (Dense)              (None, 5)                 20        
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_2 (Dense)              (None, 5)                 20        
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 6
Non-trainable params: 20

Note the lack of a warning.

My Keras version is 2.3.1, while TensorFlow is 2.2.

Answer 2

I'm using TF 2.2

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.compile(loss="mse", optimizer="adam")

model.summary()
print("***")
model.layers[0].trainable = False
model.summary()

summary:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 5)                 20        
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 5)                 20        
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 6
Non-trainable params: 20
