I'm trying to understand in depth the MuseGAN network presented in David Foster's book Generative Deep Learning (which I highly recommend). One thing I'm stuck on is the number of trainable parameters in the "Temporal Network", a sub-network of the GAN's generator.
Here's the code:
def TemporalNetwork(self):
    input_layer = Input(shape=(self.z_dim,), name='temporal_input')
    x = Reshape([1, 1, self.z_dim])(input_layer)
    x = self.conv_t(x, f=1024, k=(2, 1), s=(1, 1), a='relu', p='valid', bn=True)
    x = self.conv_t(x, f=self.z_dim, k=(self.n_bars - 1, 1), s=(1, 1), a='relu', p='valid', bn=True)
    output_layer = Reshape([self.n_bars, self.z_dim])(x)
    return Model(input_layer, output_layer)
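As a sanity check on the shapes (not part of the book's code): with padding='valid' and stride 1, Conv2DTranspose grows each spatial dimension as out = (in - 1) * stride + kernel. Using the defaults z_dim = 32 and n_bars = 2 given below, a quick sketch traces the height through the network:

```python
def conv_t_out(size, kernel, stride=1):
    # Output size of one spatial dim of Conv2DTranspose with padding='valid'
    return (size - 1) * stride + kernel

z_dim, n_bars = 32, 2

h, w = 1, 1                    # after Reshape([1, 1, z_dim])
h = conv_t_out(h, 2)           # first conv_t, kernel (2, 1): h becomes 2
h = conv_t_out(h, n_bars - 1)  # second conv_t, kernel (n_bars - 1, 1) = (1, 1): h stays 2
print(h, w)                    # (2, 1), so the final Reshape to [n_bars, z_dim] fits
```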
where self.conv_t is a wrapper for Conv2DTranspose. I suppose the point of it is to combine batch normalization and an activation with the transposed-convolution layer, so that a single call builds the whole unit when you are constructing the network. Here's the definition of conv_t:
def conv_t(self, x, f, k, s, a, p, bn):
    x = Conv2DTranspose(
        filters=f,
        kernel_size=k,
        padding=p,
        strides=s,
        kernel_initializer=self.weight_init
    )(x)
    if bn:
        x = BatchNormalization(momentum=0.9)(x)
    if a == 'relu':
        x = Activation(a)(x)
    elif a == 'lrelu':
        x = LeakyReLU()(x)
    return x
The default constants are self.z_dim = 32 and self.n_bars = 2.
I calculate the trainable parameters in transposed-convolution layers the same way I do for ordinary convolution layers: input_filters * output_filters * kernel_height * kernel_width + output_filters (one bias per output filter). I get the following:
input_layer: 0
Reshape: 0
first conv_t: 32 * 1024 * 2 * 1 + 1024 = 66560
second conv_t: 1024 * 32 * 1 * 1 + 32 = 32800
output_layer: 0
total: 99360
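The arithmetic above can be checked mechanically. Here is a small helper (my own sketch, assuming the standard Conv2DTranspose weight layout: one kernel per input/output filter pair, plus one bias per output filter) that reproduces the 99360 figure:

```python
def conv_t_params(in_f, out_f, kernel):
    # Kernel weights (in_f * out_f * kh * kw) plus one bias per output filter
    kh, kw = kernel
    return in_f * out_f * kh * kw + out_f

z_dim, n_bars = 32, 2

first = conv_t_params(z_dim, 1024, (2, 1))            # 66560
second = conv_t_params(1024, z_dim, (n_bars - 1, 1))  # 32800
print(first + second)                                 # 99360
```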
However, gan.generator.summary() reports 103584 parameters.
So, why am I missing 4224 parameters?
Edit 1: Right after asking this question, I realized that the mystery parameters might come from batch normalization. That turned out to be the answer.
Yes, batch normalization has both trainable and non-trainable parameters: two of each per channel of the preceding layer (gamma and beta are trainable; the moving mean and moving variance are not), for a total of 4 * filters.
For the first call to conv_t, batch normalization contributes 4 * 1024 = 4096 parameters.
For the second call to conv_t, batch normalization contributes 4 * 32 = 128 parameters.
Together, these constitute the 4224 parameters whose provenance had been a mystery to me.
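Putting the two pieces together (again a sketch of mine, counting 4 parameters per channel for BatchNormalization as described above) recovers the figure from the summary:

```python
z_dim, n_bars = 32, 2

def conv_t_params(in_f, out_f, kh, kw):
    # Kernel weights plus one bias per output filter
    return in_f * out_f * kh * kw + out_f

def bn_params(channels):
    # gamma + beta (trainable) and moving mean + variance (non-trainable)
    return 4 * channels

total = (conv_t_params(z_dim, 1024, 2, 1) + bn_params(1024)
         + conv_t_params(1024, z_dim, n_bars - 1, 1) + bn_params(z_dim))
print(total)  # 103584, matching gan.generator.summary()
```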