Why is the shape of the weight different? (PyTorch)
I'm trying to do softmax regression in two ways.
First way:
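import torch
import torch.nn.functional as F
import torch.optim as optim

# 8 samples, 4 features each
x_train = torch.FloatTensor([[1, 2, 1, 1],
                             [2, 1, 3, 2],
                             [3, 1, 3, 4],
                             [4, 1, 5, 5],
                             [1, 7, 5, 5],
                             [1, 2, 5, 6],
                             [1, 6, 6, 6],
                             [1, 7, 7, 7]])
# Class index (0, 1 or 2) for each sample
y_train = torch.LongTensor([2, 2, 2, 1, 1, 1, 0, 0])
# One-hot encode the labels: shape (8, 3)
y_one_hot = torch.zeros(8, 3)
y_one_hot.scatter_(1, y_train.unsqueeze(1), 1)
# Weight is (in_features, out_features) so that x_train.matmul(W) works
W = torch.zeros((4, 3), requires_grad=True)
b = torch.zeros((1, 3), requires_grad=True)
optimizer = optim.SGD([W, b], lr=0.1)
nb_epoch = 1000
for epoch in range(nb_epoch + 1):
    hypothesis = F.softmax(x_train.matmul(W) + b, dim=1)
    # Cross-entropy computed by hand from the one-hot labels
    cost = (y_one_hot * -torch.log(hypothesis)).sum(dim=1).mean()
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print('{0}th epoch Cost:{1} '.format(epoch, cost.item()))
print('W:', W.shape)
print('B:', b.shape)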
Then W.shape is torch.Size([4, 3]).
However, if I use the second way:
import torch.nn as nn

# Reuses x_train, y_train, F and optim from the first snippet
class SoftmaxClassifierModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 3)  # in_features=4, out_features=3

    def forward(self, x):
        # Returns raw logits; F.cross_entropy applies log-softmax itself
        return self.linear(x)

model = SoftmaxClassifierModel()
optimizer = optim.SGD(model.parameters(), lr=0.1)
nb_epochs = 1000
for epoch in range(nb_epochs + 1):
    prediction = model(x_train)
    cost = F.cross_entropy(prediction, y_train)
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print('Epoch {:4d}/{} Cost: {:.6f}'.format(
            epoch, nb_epochs, cost.item()
        ))
# Print the name and shape of every parameter
for name, param in model.state_dict().items():
    print(name, param.size())
Then the output is:
linear.weight torch.Size([3, 4])
linear.bias torch.Size([3])
Shouldn't the size of linear.weight be (4, 3)?
From the docs of torch.nn.Linear:
Init signature: torch.nn.Linear(in_features: int, out_features: int, bias: bool = True) -> None
Docstring:
Applies a linear transformation to the incoming data: :math:`y = xA^T + b`
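The weight matrix is transposed before it is applied to your data. nn.Linear(4, 3) stores its weight as (out_features, in_features), i.e. (3, 4), and computes x.matmul(weight.t()) + bias, so weight.t() has the same (4, 3) shape as the W in your first version. So the shape is correct.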
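A quick way to check this yourself is to rebuild the layer's output by hand. The snippet below is only a minimal sketch: the random input x, the seed, and the variable names are made up for illustration and are not part of the code in the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only so the run is repeatable

linear = nn.Linear(4, 3)  # in_features=4, out_features=3
x = torch.randn(8, 4)     # hypothetical batch: 8 samples, 4 features

# The weight is stored as (out_features, in_features).
weight_shape = tuple(linear.weight.shape)
bias_shape = tuple(linear.bias.shape)

# Reproduce the layer by hand: y = x A^T + b
manual = x.matmul(linear.weight.t()) + linear.bias
matches = torch.allclose(linear(x), manual, atol=1e-6)

# The transposed weight has the same layout as W in the first approach.
transposed_shape = tuple(linear.weight.t().shape)

print('weight:', weight_shape)
print('bias:', bias_shape)
print('weight.t():', transposed_shape)
print('manual result matches nn.Linear:', matches)
```

If you ever need to copy a hand-written W of shape (4, 3) into the layer, you would assign its transpose, since that is the layout nn.Linear expects.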