When I run a neural network (without BatchNormalization) in Keras, I understand how the get_weights() function provides the weights and biases of the network. With BatchNormalization, however, it produces 4 extra parameters per layer, which I assume are gamma, beta, the mean, and the standard deviation.
I have tried to replicate a simple network manually after saving these values, and I can't get it to produce the right output. Does anyone know how these values work?
I will use an example to explain get_weights() in the case of a simple Multi Layer Perceptron (MLP) and an MLP with Batch Normalization (BN).
Example: Say we are working on the MNIST dataset and using a 2-layer MLP architecture (i.e. 2 hidden layers). Hidden layer 1 has 392 neurons and hidden layer 2 has 196 neurons. So the final architecture for our MLP is 784 x 392 x 196 x 10.
Here 784 is the input image dimension and 10 is the output layer dimension.
Case 1: MLP without Batch Normalization => Say my model is named model_relu and uses the ReLU activation function. After training model_relu, calling get_weights() returns a list of size 6, with the following shapes:
(784, 392): weights for hidden layer1
(392,): bias associated with weights of hidden layer1
(392, 196): weights for hidden layer2
(196,): bias associated with weights of hidden layer2
(196, 10): weights for output layer
(10,): bias associated with weights of output layer
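The six shapes above can be verified with a minimal sketch of the Case 1 architecture (the model definition is illustrative; an untrained model already has weight arrays of the same shapes):

```python
# Sketch of the Case 1 MLP; get_weights() alternates kernel and bias per Dense layer.
from tensorflow import keras
from tensorflow.keras import layers

model_relu = keras.Sequential([
    keras.Input(shape=(784,)),               # flattened 28x28 MNIST image
    layers.Dense(392, activation="relu"),    # hidden layer 1
    layers.Dense(196, activation="relu"),    # hidden layer 2
    layers.Dense(10, activation="softmax"),  # output layer
])

weights = model_relu.get_weights()
print(len(weights))                 # 6: W1, b1, W2, b2, W3, b3
print([w.shape for w in weights])
```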
Case 2: MLP with Batch Normalization => Say my model is named model_batch and uses the ReLU activation function along with Batch Normalization. After training model_batch, calling get_weights() returns a list of size 14, with the following shapes:
(784, 392): weights for hidden layer1
(392,): bias associated with weights of hidden layer1
(392,) (392,) (392,) (392,): these four parameters are the gamma, beta, moving mean, and moving variance values, of size 392 each, associated with the Batch Normalization of hidden layer1.
(392, 196): weights for hidden layer2
(196,): bias associated with weights of hidden layer2
(196,) (196,) (196,) (196,): these four parameters are the gamma, beta, moving mean, and moving variance, of size 196 each, associated with the Batch Normalization of hidden layer2.
(196, 10): weights for output layer
(10,): bias associated with weights of output layer
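Analogously, a sketch of the Case 2 architecture (again illustrative) shows where the 14 arrays come from, with each BatchNormalization layer contributing four:

```python
# Sketch of the Case 2 MLP: each BatchNormalization layer adds
# [gamma, beta, moving_mean, moving_variance] to get_weights().
from tensorflow import keras
from tensorflow.keras import layers

model_batch = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(392, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(196, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(10, activation="softmax"),
])

weights = model_batch.get_weights()
print(len(weights))                 # 14: (W1, b1) + 4 BN + (W2, b2) + 4 BN + (W3, b3)
print([w.shape for w in weights])
```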
So, in Case 2, if you want to get the weights for hidden layer1, hidden layer2, and the output layer, the Python code can be something like this:
weights = model_batch.get_weights()
hidden_layer1_wt = weights[0].flatten().reshape(-1, 1)
hidden_layer2_wt = weights[6].flatten().reshape(-1, 1)
output_layer_wt = weights[12].flatten().reshape(-1, 1)
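To address replicating the network manually: at inference time, Keras applies the stored statistics as y = gamma * (x - moving_mean) / sqrt(moving_variance + epsilon) + beta, where epsilon defaults to 1e-3 in Keras (a common source of mismatch if you assume 1e-5, or if you use a standard deviation instead of a variance). A NumPy sketch (the function name is mine):

```python
# Inference-time batch normalization in plain NumPy, using the four arrays
# returned by get_weights(): gamma, beta, moving_mean, moving_variance.
import numpy as np

def batch_norm_inference(x, gamma, beta, moving_mean, moving_variance, eps=1e-3):
    # Keras' BatchNormalization uses epsilon=1e-3 by default.
    return gamma * (x - moving_mean) / np.sqrt(moving_variance + eps) + beta

# Sanity check: gamma=1, beta=0, mean=0, variance=1 (eps=0) is the identity.
x = np.array([[1.0, 2.0]])
out = batch_norm_inference(x, np.ones(2), np.zeros(2), np.zeros(2), np.ones(2), eps=0.0)
print(out)  # [[1. 2.]]
```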
Hope this helps!
The four values given are gamma, beta, the moving_mean, and the moving_variance (not the standard deviation). You can check this in the Keras source code.