Why are the weight shapes different? (PyTorch)

Question

I'm trying to do softmax regression in two ways.

First way:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

x_train = torch.FloatTensor([[1, 2, 1, 1],
                             [2, 1, 3, 2],
                             [3, 1, 3, 4],
                             [4, 1, 5, 5],
                             [1, 7, 5, 5],
                             [1, 2, 5, 6],
                             [1, 6, 6, 6],
                             [1, 7, 7, 7]])
y_train = torch.LongTensor([2, 2, 2, 1, 1, 1, 0, 0])

y_one_hot=torch.zeros(8,3)
y_one_hot.scatter_(1,y_train.unsqueeze(1),1)

W=torch.zeros((4,3),requires_grad=True)
b=torch.zeros((1,3),requires_grad=True)
optimizer=optim.SGD([W,b],lr=0.1)

nb_epoch=1000
for epoch in range(nb_epoch+1):
    hypothesis=F.softmax(x_train.matmul(W)+b,dim=1)
    cost=(y_one_hot*-torch.log(hypothesis)).sum(dim=1).mean()
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()
    if epoch % 100==0:
        print('{0}th epoch Cost:{1} '.format(epoch,cost.item()))
print('W:',W.shape)
print('B:',b.shape)

Then W.shape is torch.Size([4, 3]).

However, if I use the second way:

class SoftmaxClassifierModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 3) 

    def forward(self, x):
        return self.linear(x)
model = SoftmaxClassifierModel()
optimizer = optim.SGD(model.parameters(), lr=0.1)

nb_epochs = 1000
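for epoch in range(nb_epochs + 1):
    # Forward pass: the model returns raw logits
    prediction = model(x_train)

    # F.cross_entropy applies log-softmax internally and takes class indices
    cost = F.cross_entropy(prediction, y_train)

    # Backward pass and parameter update
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    # Log the cost every 100 epochs
    if epoch % 100 == 0:
        print('Epoch {:4d}/{} Cost: {:.6f}'.format(
            epoch, nb_epochs, cost.item()
        ))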
for name, param in model.state_dict().items():
    print(name, param.size())

Then the output is:

linear.weight torch.Size([3, 4])
linear.bias torch.Size([3])

Shouldn't the size of linear.weight be (4, 3)?

Answer 1

From the docs of torch.nn.Linear:

Init signature: torch.nn.Linear(in_features: int, out_features: int, bias: bool = True) -> None
Docstring:     
Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

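nn.Linear stores its weight with shape (out_features, in_features), here (3, 4), and transposes it before applying it to your data, so the layer computes x @ weight.T + bias. So the shape is correct: linear.weight is the transpose of the (4, 3) matrix W from your first version.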
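
To make the transpose concrete, here is a minimal sketch that checks the stored shapes and compares the layer's output against a hand-written x @ weight.T + bias. The random input x and the variable names (manual_out, W_like_manual, and so on) are illustrative assumptions, not part of the original question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

linear = nn.Linear(4, 3)   # in_features=4, out_features=3
x = torch.randn(8, 4)      # illustrative batch: 8 samples, 4 features each

# nn.Linear stores weight as (out_features, in_features) and bias as (out_features,).
weight_shape = tuple(linear.weight.shape)
bias_shape = tuple(linear.bias.shape)

# The layer computes x @ weight.T + bias, so the output has shape (8, 3).
manual_out = x.matmul(linear.weight.t()) + linear.bias
layer_out = linear(x)
same = torch.allclose(manual_out, layer_out, atol=1e-6)

# Transposing the layer's weight gives the (4, 3) layout used by the manual W.
W_like_manual = linear.weight.t()

print(weight_shape, bias_shape, tuple(layer_out.shape), same, tuple(W_like_manual.shape))
```

If you want to compare the two training runs from the question directly, model.linear.weight.t() is the tensor to set against W, and model.linear.bias corresponds to b with the leading dimension of size 1 dropped.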

Solution 1
0 2020-12-10 21:08:32
