
Dimensional Difference between Running mean and Sample mean in Batch normalization

I have recently been teaching myself through cs231n online, and in the batch normalization assignment, specifically in the running mean calculation:
running_mean = momentum * running_mean + (1 - momentum) * sample_mean
the running_mean is initialized by
running_mean = bn_param.get("running_mean", np.zeros(D, dtype=x.dtype))
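
In isolation this update is just an exponential moving average over batches. A minimal sketch of that mechanic (the feature width, momentum, and loop here are illustrative, not taken from the assignment code):

    import numpy as np

    D, momentum = 5, 0.9   # hypothetical feature width and momentum
    bn_param = {}          # empty on the first forward pass

    for step in range(3):
        x = np.random.randn(16, D)    # one batch of inputs, shape (N, D)
        sample_mean = x.mean(axis=0)  # per-feature mean, shape (D,)
        # First call falls back to zeros(D); later calls reuse the stored value.
        running_mean = bn_param.get("running_mean", np.zeros(D, dtype=x.dtype))
        running_mean = momentum * running_mean + (1 - momentum) * sample_mean
        bn_param["running_mean"] = running_mean  # persist for the next batch

    print(bn_param["running_mean"].shape)  # (5,), same shape as sample_mean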
So when you have more than one batchnorm layer, the running_mean value is inherited from the last batchnorm layer, but the sample_mean is computed from the current layer's input, which results in:

File ~/assignment/assignment2/cs231n/layers.py:217, in batchnorm_forward(x, gamma, beta, bn_param)
    213 out = x_hat * gamma + beta
    215 print(running_mean.shape, miu.shape)
--> 217 running_mean = momentum * running_mean + (1 - momentum) * miu
    218 running_var = momentum * running_var + (1 - momentum) * sigma_squared
    220 cache = miu, sigma_squared, eps, N, x_hat, x, gamma

ValueError: operands could not be broadcast together with shapes (1,20) (1,30) 
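To make the error concrete: the broadcast failure says the running_mean retrieved from bn_param has 20 features while the current layer's sample_mean has 30. A minimal reproduction, assuming a single bn_param dict is shared by two layers of different widths (the widths 20 and 30 are taken from the traceback; whether the dict is actually shared in the model is an assumption):

    import numpy as np

    momentum = 0.9
    shared_bn_param = {}  # hypothetical: one dict reused by every layer

    for D in (20, 30):    # two batchnorm layers with different feature widths
        x = np.random.randn(8, D)
        sample_mean = x.mean(axis=0)  # shape (D,)
        running_mean = shared_bn_param.get("running_mean", np.zeros(D, dtype=x.dtype))
        # Second pass: running_mean is (20,) but sample_mean is (30,) -> ValueError
        running_mean = momentum * running_mean + (1 - momentum) * sample_mean
        shared_bn_param["running_mean"] = running_mean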

What did I miss here? The derivation seems right, though.

I tried to implement the batchnorm layer, but the dimensions of running_mean and sample_mean are different.

This is what I have:

        # Per-feature batch statistics over axis 0; miu and var have shape (D,)
        miu = np.mean(x, axis=0)
        var = np.var(x, axis=0)
        # Normalize, then scale and shift with the learnable gamma and beta
        x_hat = (x - miu) / np.sqrt(var + eps)
        out = x_hat * gamma + beta
        print(running_mean.shape, miu.shape)
        # Exponential moving average of the batch statistics
        running_mean = momentum * running_mean + (1 - momentum) * miu
        running_var = momentum * running_var + (1 - momentum) * var
        cache = miu, var, eps, N, x_hat, x, gamma
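
For reference, the cs231n skeleton stores the updated statistics back into bn_param at the end of batchnorm_forward so they persist between calls; a sketch of that step, using the variable names from the snippet above:

        # Store the updated running statistics back into bn_param so they
        # carry over to the next forward pass and to test-time normalization.
        bn_param["running_mean"] = running_mean
        bn_param["running_var"] = running_var

In the assignment scaffolding each batchnorm layer is passed its own bn_param dict from a per-layer list, so one layer's running statistics should never be visible to another layer.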
