Error when training a CNN: "RuntimeError: The size of tensor a (10) must match the size of tensor b (64) at non-singleton dimension 1"

Question

I'm new to PyTorch and I'm trying to implement a simple CNN to recognize MNIST images.

I'm training the network using MSE loss as the loss function and SGD as the optimizer. When I start training, it gives me the following

warning: "UserWarning: Using a target size (torch.Size([64])) that is different to the input size (torch.Size([64, 10])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size."

And then I get the following

error: "RuntimeError: The size of tensor a (10) must match the size of tensor b
       (64) at non-singleton dimension 1".

I've tried to solve it using some solutions I found in other questions, but nothing seems to work. Here's the code I use to load the dataset:

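import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,),(0.5,))])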

trainset = torchvision.datasets.MNIST(root='./data', train = True, transform = transform, download = True)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 64, shuffle = True)

testset = torchvision.datasets.MNIST(root='./data', train = False, transform = transform, download = True)
testloader = torch.utils.data.DataLoader(testset, batch_size = 64, shuffle = False)

The code to define my network:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        #Convolutional layers
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 12, 5)
        #Fully connected layers
        self.fc1 = nn.Linear(12*4*4, 120)
        self.fc2 = nn.Linear(120, 60)
        self.out = nn.Linear(60,10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2,2))
        x = F.max_pool2d(F.relu(self.conv2(x)), (2,2))
        x = x.reshape(-1, 12*4*4)
        x = F.relu(self.fc1(x))         
        x = F.relu(self.fc2(x))
        x = self.out(x)
        return x

And this is the training:

net = Net()
print(net)

criterion = nn.MSELoss() 
optimizer = optim.SGD(net.parameters(), lr=0.001)
epochs = 3

for epoch in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        optimizer.zero_grad()
        output = net(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    print(f"Training loss: {running_loss/len(trainloader)}")

print('Finished training')

Thank you!

Answer 1

The loss you're using (nn.MSELoss) is incorrect for this problem. You should use nn.CrossEntropyLoss.

Mean squared error loss measures the mean squared error between input x and target y, so the input and target should have the same shape.
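To make the shape requirement concrete, here is a minimal sketch that reproduces the mismatch with random stand-in tensors (not the asker's actual data). The names `output`, `labels` and `one_hot` are just illustrative, and the one-hot variant is shown only to illustrate what MSE would need, not as a recommendation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
output = torch.randn(64, 10)          # stand-in for net(images): one row per image
labels = torch.randint(0, 10, (64,))  # stand-in for MNIST labels: one class index per image

# Shapes (64, 10) and (64,) cannot be broadcast together: aligned from the
# right, 10 is compared with 64 and neither of them is 1.
try:
    nn.MSELoss()(output, labels.float())
    mse_failed = False
except RuntimeError:
    mse_failed = True

# MSE only becomes computable once the target has the same shape as the
# output, for example after one-hot encoding the labels.
one_hot = F.one_hot(labels, num_classes=10).float()  # shape (64, 10)
mse = nn.MSELoss()(output, one_hot)
```

With the one-hot target the call runs, but for classification the cross-entropy loss is still the more usual choice.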

Cross-entropy loss works on a distribution over the classes for each image and compares it with the true class. The output is an N x C matrix and the target is a vector of size N (N = batch size, C = number of classes).
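As a rough sketch of those shapes, assuming random tensors in place of the real network output and labels (`logits` and `targets` are illustrative names):

```python
import torch
import torch.nn as nn

N, C = 64, 10                        # batch size, number of classes
logits = torch.randn(N, C)           # raw, unnormalized network outputs
targets = torch.randint(0, C, (N,))  # integer class indices (dtype int64)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, targets)    # scalar, averaged over the batch by default
```

Note that the target holds class indices rather than one-hot vectors, which is the format the MNIST DataLoader in the question already yields.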

Since your aim is to classify the image, this is what you'll want to use.

In your case, the network output is a matrix of size 64 x 10 and the target is a vector of size 64. Applying the softmax function to each row of the output matrix gives the probability of each class for that image, after which the cross-entropy loss is computed. PyTorch's nn.CrossEntropyLoss combines the softmax operation with the loss computation, so the network should return the raw outputs.
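To illustrate that combination, the sketch below computes the loss by hand (softmax, then the negative log-probability of the correct class) and compares it with the built-in function. It uses random tensors, and the variable names are only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(64, 10)           # stand-in for the 64 x 10 network output
targets = torch.randint(0, 10, (64,))  # stand-in for the 64 class labels

# Softmax turns each row into class probabilities that sum to 1.
probs = F.softmax(logits, dim=1)

# Pick the probability assigned to the correct class of each image,
# take the negative log, and average over the batch.
manual = -torch.log(probs[torch.arange(64), targets]).mean()

# The built-in loss does the same thing directly on the raw outputs.
builtin = F.cross_entropy(logits, targets)
```

The two values should agree up to floating-point error, which is why no extra softmax layer is needed at the end of the network when using this loss.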

You can refer to the PyTorch documentation for more info on how PyTorch computes losses.

Answer 2

I agree with @AshwinNair's advice. I also changed the for loop in the train and eval sections as shown below, and it works for me.

for i, (img, label) in enumerate(dataloader):
    img = img.to(device)
    label = label.to(device)
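For context, here is a sketch of how that loop could fit into a full training step together with nn.CrossEntropyLoss from the accepted answer. It reuses the Net definition from the question, but it assumes random stand-in data instead of MNIST so that it runs without a download; `device`, `dataloader` and `losses` are illustrative names.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 12, 5)
        self.fc1 = nn.Linear(12 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 60)
        self.out = nn.Linear(60, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))  # 28 -> 24 -> 12
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))  # 12 -> 8 -> 4
        x = x.reshape(-1, 12 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.out(x)  # raw scores, no softmax here


# Use the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Random stand-in data with MNIST-like shapes (assumption, not real MNIST).
images = torch.randn(128, 1, 28, 28)
labels = torch.randint(0, 10, (128,))
dataloader = DataLoader(TensorDataset(images, labels), batch_size=64)

net = Net().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.001)

losses = []
for i, (img, label) in enumerate(dataloader):
    # The model and both tensors have to live on the same device.
    img = img.to(device)
    label = label.to(device)

    optimizer.zero_grad()
    output = net(img)                # shape (64, 10)
    loss = criterion(output, label)  # label has shape (64,)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

Moving the tensors to the device does not change their shapes, so the shape error itself is resolved by the loss change, while the `.to(device)` calls matter once the model is on a GPU.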


Answer 1: 3 (accepted), 2019-11-19 13:13:44

Answer 2: 0, 2022-08-11 15:35:48
