Layer normalization in PyTorch
I'm trying to test the layer normalization function of PyTorch, but I don't know why b[0] and result have different values here. Did I do something wrong?
import numpy as np
import torch
import torch.nn as nn
a = torch.randn(1, 5)
m = nn.LayerNorm(a.size()[1:], elementwise_affine=False)
b = m(a)
Result:
input: a[0] = tensor([-1.3549, 0.3857, 0.1110, -0.8456, 0.1486])
output: b[0] = tensor([-1.5561, 1.0386, 0.6291, -0.7967, 0.6851])
mean = torch.mean(a[0])
var = torch.var(a[0])
result = (a[0]-mean)/(torch.sqrt(var+1e-5))
Result:
result = tensor([-1.3918, 0.9289, 0.5627, -0.7126, 0.6128])
And, for n*2 normalization, the result of PyTorch layer norm is always [1.0, -1.0] (or [-1.0, 1.0]). I can't understand why. Please let me know if you have any hints.
a = torch.randn(1, 2)
m = nn.LayerNorm(a.size()[1:], elementwise_affine=False)
b = m(a)
Result:
b = tensor([-1.0000, 1.0000])
For calculating the variance, use torch.var(a[0], unbiased=False). Then you will get the same result. By default, PyTorch calculates the unbiased estimate of the variance.
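A minimal sketch of this fix, reusing the question's setup: with unbiased=False the manual computation matches nn.LayerNorm (which uses eps=1e-5 by default).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
a = torch.randn(1, 5)

m = nn.LayerNorm(a.size()[1:], elementwise_affine=False)
b = m(a)

mean = torch.mean(a[0])
var = torch.var(a[0], unbiased=False)  # biased (population) variance, as LayerNorm uses
result = (a[0] - mean) / torch.sqrt(var + 1e-5)

print(torch.allclose(b[0], result, atol=1e-5))  # True
```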
For your 1st question, as @Theodor said, you need to use unbiased=False when calculating the variance.
Only if you want to explore more: as your input size is 5, the unbiased estimate of the variance will be 5/4 = 1.25 times the biased estimate, because the unbiased estimate uses N-1 instead of N in the denominator. As a result, each value of the result you generated is sqrt(4/5) = 0.8944 times the corresponding value of b[0].
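This ratio can be checked numerically; the small sketch below (ignoring LayerNorm's tiny eps term) shows that the unbiased-normalized values are sqrt(4/5) times the biased-normalized ones:

```python
import math
import torch

# unbiased variance = N/(N-1) * biased variance, so normalizing with the
# unbiased estimate shrinks each value by sqrt((N-1)/N) = sqrt(4/5) here.
torch.manual_seed(1)
N = 5
a = torch.randn(N)

z_biased = (a - a.mean()) / torch.sqrt(torch.var(a, unbiased=False))
z_unbiased = (a - a.mean()) / torch.sqrt(torch.var(a, unbiased=True))

print(torch.allclose(z_unbiased, math.sqrt(4 / 5) * z_biased, atol=1e-6))  # True
```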
About your 2nd question:
And, for n*2 normalization, the result of pytorch layer norm is always [1.0, -1.0]
This is reasonable. Suppose the only two elements are a and b. Then the mean is (a+b)/2 and the variance is ((a-b)^2)/4. So the normalization result is [((a-b)/2) / sqrt(variance), ((b-a)/2) / sqrt(variance)], which is essentially [1, -1] or [-1, 1] depending on whether a > b or a < b.
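A quick sketch confirming this for the two-element case (the values are marginally below 1 in magnitude because of the eps term inside LayerNorm):

```python
import torch
import torch.nn as nn

m = nn.LayerNorm(2, elementwise_affine=False)

# Deterministic example: mean = -0.5, biased variance = 1.0.
a = torch.tensor([[0.5, -1.5]])
b = m(a)
print(b)  # approximately tensor([[ 1.0000, -1.0000]])

# Any other pair of inputs gives the same pattern, only the sign order changes.
c = m(torch.randn(1, 2))
print(c)
```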