Pytorch: size mismatch error although the sizes of the matrices do match (m1: [256 x 200], m2: [256 x 200])

I am trying to do transfer learning by pretraining a model (self-supervised learning) on rotation prediction (0, 90, 180, and 270 degrees: 4 labels) using unlabelled data. Here is the model:

import torch.nn as nn


class RotNet1(nn.Module):
    def __init__(self):
        super(RotNet1, self).__init__()
        keep_prob = 0.9
        # Block 1: 7x7 conv -> ReLU -> 2x2 max pool -> dropout
        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=80, kernel_size=7, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(p=1 - keep_prob),
        )
        self.bn1 = nn.BatchNorm2d(num_features=80)
        self.dropout1 = nn.Dropout2d(p=0.02)
        # Block 2: 3x3 conv (padding 1) keeps the spatial size before pooling
        self.layer2 = nn.Sequential(
            nn.Conv2d(in_channels=80, out_channels=128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(p=1 - keep_prob),
        )
        self.bn2 = nn.BatchNorm2d(num_features=128)
        # Blocks 3 to 5: unpadded 3x3 convs
        self.layer3 = nn.Sequential(
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(p=1 - keep_prob),
        )
        self.bn3 = nn.BatchNorm2d(num_features=256)
        self.layer4 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(p=1 - keep_prob),
        )
        self.bn4 = nn.BatchNorm2d(num_features=512)
        self.layer5 = nn.Sequential(
            nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(p=1 - keep_prob),
        )
        self.bn5 = nn.BatchNorm2d(num_features=512)
        self.drop_out = nn.Dropout()
        # Classifier head: flattened 512 x 2 x 2 features -> 200 -> 4 rotation labels
        self.fc1 = nn.Linear(512 * 2 * 2, 200)
        self.fc2 = nn.Linear(200, 4)
        #self.fc3 = nn.Linear(200, 100)

    def forward(self, input):
        out = self.layer1(input)
        out = self.bn1(out)
        out = self.dropout1(out)
        out = self.layer2(out)
        out = self.bn2(out)
        out = self.layer3(out)
        out = self.bn3(out)
        out = self.layer4(out)
        out = self.bn4(out)
        out = self.layer5(out)
        out = self.bn5(out)
        # Flatten to (batch_size, 512 * 2 * 2) before the fully connected layers
        out = out.reshape(out.size(0), -1)
        out = self.drop_out(out)
        out = self.fc1(out)
        out = self.fc2(out)
        #out = self.fc3(out)
        return out
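The question does not state the input resolution. As a rough check of where the 512 * 2 * 2 in fc1 comes from, the sketch below traces the spatial size through the five conv/pool blocks, assuming square 64 x 64 inputs (an assumption, not something given in the post). The helper names out_size and trace_sizes are made up for illustration; the formula is the standard output-size rule for Conv2d and MaxPool2d with dilation 1.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial size after a Conv2d or MaxPool2d (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (conv kernel size, conv padding) for layer1 .. layer5 of RotNet1
conv_specs = [(7, 0), (3, 1), (3, 0), (3, 0), (3, 0)]


def trace_sizes(input_size):
    """Return the spatial size after every conv and every pool."""
    sizes = [input_size]
    size = input_size
    for kernel, padding in conv_specs:
        size = out_size(size, kernel, stride=1, padding=padding)  # Conv2d
        sizes.append(size)
        size = out_size(size, kernel=2, stride=2, padding=1)      # MaxPool2d
        sizes.append(size)
    return sizes


sizes = trace_sizes(64)  # 64 x 64 input is an assumption
print(sizes)             # [64, 58, 30, 30, 16, 14, 8, 6, 4, 2, 2]

flat_features = 512 * sizes[-1] ** 2
print(flat_features)     # 2048, i.e. 512 * 2 * 2
```

Under that assumption the flattened feature vector has 2048 entries, which matches the input size of fc1, so the mismatch cannot come from the convolutional part of the network.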

I trained this model on those 4 labels and named it model_ssl. I then copied the model and changed the output size of the last fully connected layer from 4 to 200, which is the number of labels in the labelled training and validation sets, where the number of examples is restricted:

model_a = copy.copy(model_ssl)
#model_a.classifier
num_classes = 200
model_a.fc2 = nn.Linear(256,num_classes).cuda()

model_a.to(device)
loss_fn = torch.nn.CrossEntropyLoss()
n_epochs_a = 20
learning_rate_a = 0.01
alpha_a = 1e-5
momentum_a = 0.9
optimizer = torch.optim.SGD(model_a.parameters(), 
                            momentum = momentum_a,
                            nesterov=True,
                            weight_decay = alpha_a,
                            lr=learning_rate_a)
train_losses_a, val_losses_a, train_acc_a, val_acc_a = train(model_a, 
                                                             train_dataloader_sl, 
                                                             val_dataloader_sl, 
                                                             optimizer, 
                                                             n_epochs_a, 
                                                             loss_fn)

Here is the error message:

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-27-f6f362ba8c53> in <module>()
     15                                                              optimizer,
     16                                                              n_epochs_a,
---> 17                                                              loss_fn)

6 frames

<ipython-input-23-df58f17c5135> in train(model, train_dataloader, val_dataloader, optimizer, n_epochs, loss_function)
     57     for epoch in range(n_epochs):
     58         model.train()
---> 59         train_loss, train_accuracy = train_epoch(model, train_dataloader, optimizer, loss_fn)
     60         model.eval()
     61         val_loss, val_accuracy = evaluate(model, val_dataloader, loss_fn)

<ipython-input-23-df58f17c5135> in train_epoch(model, train_dataloader, optimizer, loss_fn)
     10         labels = labels.to(device=device, dtype=torch.int64)
     11         # Run predictions
---> 12         output = model(images)
     13         # Set gradients to zero
     14         optimizer.zero_grad()

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

<ipython-input-11-2cd851b6d8e4> in forward(self, input)
     85         out = self.drop_out(out)
     86         out = self.fc1(out)
---> 87         out = self.fc2(out)
     88         #out = self.fc3(out)
     89         return out

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py in forward(self, input)
     85 
     86     def forward(self, input):
---> 87         return F.linear(input, self.weight, self.bias)
     88 
     89     def extra_repr(self):

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1608     if input.dim() == 2 and bias is not None:
   1609         # fused op is marginally faster
-> 1610         ret = torch.addmm(bias, input, weight.t())
   1611     else:
   1612         output = input.matmul(weight.t())

RuntimeError: size mismatch, m1: [256 x 200], m2: [256 x 200] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:283

The sizes of the matrices m1 and m2 seem to match, but I still get this error message. What should I do?

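fc1 outputs 200 features, so the input size of fc2 must be 200, not 256. In the error message, m1 is the batch fed to fc2 (256 samples x 200 features) and m2 is the transposed weight of fc2 (256 x 200). The product of m1 and m2 requires the number of columns of m1 (200) to equal the number of rows of m2 (256), so the shapes do not match even though they print identically. nn.Linear takes in_features first and out_features second, so num_classes and 256 should be switched:

num_classes = 200
model_a.fc2 = nn.Linear(num_classes, 256).cuda()

This works because num_classes happens to equal the 200 outputs of fc1, but the layer then produces 256 outputs. If fc2 should emit exactly one logit per class, use nn.Linear(200, num_classes) instead.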
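A minimal sketch of the shape rule, using stand-alone layers with the same sizes as fc1 and fc2 rather than the full model. The batch size of 8, the CPU tensors, and the names bad_head and head are illustrative assumptions. It takes the input size of the new head from fc1.out_features so the two layers cannot drift apart.

```python
import torch
import torch.nn as nn

num_classes = 200

# Stand-in for the pretrained fc1: 512 * 2 * 2 inputs -> 200 features
fc1 = nn.Linear(512 * 2 * 2, 200)
features = fc1(torch.randn(8, 512 * 2 * 2))  # shape (8, 200)

# The head from the question expects 256 input features, but fc1 yields 200
bad_head = nn.Linear(256, num_classes)
mismatch_raised = False
try:
    bad_head(features)
except RuntimeError as err:
    mismatch_raised = True
    print("size mismatch:", err)

# nn.Linear stores its weight as (out_features, in_features); F.linear
# multiplies the input by weight.t(), so in_features must equal 200 here
head = nn.Linear(fc1.out_features, num_classes)
logits = head(features)
print(tuple(features.shape), tuple(head.weight.t().shape), tuple(logits.shape))
```

With the head built this way, the output has one column per class, which is the shape torch.nn.CrossEntropyLoss expects for integer class labels.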
