When I run a neural network (without BatchNormalization) in Keras, I understand how the get_weights() function provides the weights and biases of the network. With BatchNormalization, however, it produces 4 extra parameters per layer, which I assume are gamma, beta, the mean, and the standard deviation.
I have tried to replicate a simple network manually after saving these values, and I can't get it to produce the right output. Does anyone know how these values work?
I will use an example to explain get_weights() in the case of a simple Multi Layer Perceptron (MLP) and an MLP with Batch Normalization (BN).
Example: Say we are working on the MNIST dataset and using a 2-layer MLP architecture (i.e. 2 hidden layers). Hidden layer 1 has 392 neurons and hidden layer 2 has 196 neurons. So the final architecture for our MLP is 784 x 392 x 196 x 10.
Here 784 is the input image dimension and 10 is the output layer dimension.
Case 1: MLP without Batch Normalization => Say my model is named model_relu and uses the ReLU activation function. After training model_relu, calling get_weights() returns a list of size 6, with the following shapes:
(784, 392): weights for hidden layer1
(392,): bias associated with weights of hidden layer1
(392, 196): weights for hidden layer2
(196,): bias associated with weights of hidden layer2
(196, 10): weights for output layer
(10,): bias associated with weights of output layer
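The six shapes above can be verified with a minimal sketch of the Case 1 architecture (the model definition is illustrative; an untrained model already has weight arrays of the same shapes):

```python
# Sketch of the Case 1 MLP; get_weights() alternates kernel and bias per Dense layer.
from tensorflow import keras
from tensorflow.keras import layers

model_relu = keras.Sequential([
    keras.Input(shape=(784,)),               # flattened 28x28 MNIST image
    layers.Dense(392, activation="relu"),    # hidden layer 1
    layers.Dense(196, activation="relu"),    # hidden layer 2
    layers.Dense(10, activation="softmax"),  # output layer
])

weights = model_relu.get_weights()
print(len(weights))                 # 6: W1, b1, W2, b2, W3, b3
print([w.shape for w in weights])
```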
Case 2: MLP with Batch Normalization => Say my model is named model_batch and uses the ReLU activation function along with Batch Normalization. After training model_batch, calling get_weights() returns a list of size 14, with the following shapes:
(784, 392): weights for hidden layer1
(392,): bias associated with weights of hidden layer1
(392,) (392,) (392,) (392,): these four parameters are the gamma, beta, moving mean, and moving variance values, of size 392 each, associated with the Batch Normalization of hidden layer1.
(392, 196): weights for hidden layer2
(196,): bias associated with weights of hidden layer2
(196,) (196,) (196,) (196,): these four parameters are the gamma, beta, moving mean, and moving variance, of size 196 each, associated with the Batch Normalization of hidden layer2.
(196, 10): weights for output layer
(10,): bias associated with weights of output layer
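Analogously, a sketch of the Case 2 architecture (again illustrative) shows where the 14 arrays come from, with each BatchNormalization layer contributing four:

```python
# Sketch of the Case 2 MLP: each BatchNormalization layer adds
# [gamma, beta, moving_mean, moving_variance] to get_weights().
from tensorflow import keras
from tensorflow.keras import layers

model_batch = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(392, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(196, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(10, activation="softmax"),
])

weights = model_batch.get_weights()
print(len(weights))                 # 14: (W1, b1) + 4 BN + (W2, b2) + 4 BN + (W3, b3)
print([w.shape for w in weights])
```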
So, in Case 2, if you want to get the weights for hidden layer1, hidden layer2, and the output layer, the Python code can be something like this:
weights = model_batch.get_weights()
hidden_layer1_wt = weights[0].flatten().reshape(-1, 1)
hidden_layer2_wt = weights[6].flatten().reshape(-1, 1)
output_layer_wt = weights[12].flatten().reshape(-1, 1)
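To address replicating the network manually: at inference time, Keras applies the stored statistics as y = gamma * (x - moving_mean) / sqrt(moving_variance + epsilon) + beta, where epsilon defaults to 1e-3 in Keras (a common source of mismatch if you assume 1e-5, or if you use a standard deviation instead of a variance). A NumPy sketch (the function name is mine):

```python
# Inference-time batch normalization in plain NumPy, using the four arrays
# returned by get_weights(): gamma, beta, moving_mean, moving_variance.
import numpy as np

def batch_norm_inference(x, gamma, beta, moving_mean, moving_variance, eps=1e-3):
    # Keras' BatchNormalization uses epsilon=1e-3 by default.
    return gamma * (x - moving_mean) / np.sqrt(moving_variance + eps) + beta

# Sanity check: gamma=1, beta=0, mean=0, variance=1 (eps=0) is the identity.
x = np.array([[1.0, 2.0]])
out = batch_norm_inference(x, np.ones(2), np.zeros(2), np.zeros(2), np.ones(2), eps=0.0)
print(out)  # [[1. 2.]]
```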
Hope this helps!
The four values given are gamma, beta, the moving_mean, and the moving_variance (not the standard deviation). You can check this in the Keras source code.