PyTorch loss function dimension mismatch

Question

I'm trying to train word embeddings using batch training, as shown below.

def forward(self, inputs):
    print(inputs.shape)
    embeds = self.embeddings(inputs)
    print(embeds.shape)
    out = self.linear1(embeds)
    print(out.shape)
    out = self.activation_function1(out)
    print(out.shape)
    out = self.linear2(out).cuda()
    print(out.shape)
    out = self.activation_function2(out)
    print(out.shape)
    return out.cuda()

Here, I'm using a context size of 4, a batch size of 32, an embedding size of 50, a hidden layer size of 64, and a vocab size of 9927.

The output of the "shape" prints is:

print(inputs.shape) ----> torch.Size([4, 32])

print(embeds.shape) ----> torch.Size([4, 32, 50])

print(out.shape) ----> torch.Size([4, 32, 64])

print(out.shape) ----> torch.Size([4, 32, 9927])

Are these shapes correct? I'm quite confused.

Also, when I train, it returns an error:

def train(epoch):
  model.train()
  for batch_idx, (data, target) in enumerate(train_loader, 0):
    optimizer.zero_grad()
    output = model(torch.stack(data))
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()

I'm getting an error on the line "loss = criterion(output, target)". It says "Expected input batch_size (4) to match target batch_size (32)." Are my shapes in the "forward" function correct? I'm not that familiar with batch training. How do I make the dimensions match?

-------EDIT: Posting init code below -----

  def __init__(self, vocab_size, embedding_dim):
    super(CBOW, self).__init__()
    self.embeddings = nn.Embedding(vocab_size, embedding_dim)
    self.linear1 = nn.Linear(embedding_dim, 64)
    self.activation_function1 = nn.ReLU()
    self.linear2 = nn.Linear(64, vocab_size)
    self.activation_function2 = nn.LogSoftmax(dim = -1)

Answer 1

The loss function needs the batch size as the first dimension of its input. (torch.nn.Linear itself only operates on the last dimension and accepts any leading dimensions.)

You are supplying the batch size as the second dimension (the first being timesteps); use permute(1, 0, 2) to move the batch dimension first.
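As a minimal sketch of that permute call, using a random tensor with the shapes from the question (the variable names here are illustrative, not from the original code):

```python
import torch

# Shapes taken from the question: context size 4, batch size 32, embedding size 50.
embeds = torch.randn(4, 32, 50)        # (context, batch, embedding)

# permute reorders the dimensions; here dimensions 0 and 1 swap places.
batch_first = embeds.permute(1, 0, 2)  # (batch, context, embedding)

print(batch_first.shape)  # torch.Size([32, 4, 50])
```

For the 2D index tensor passed into the model, `inputs.t()` would do the same swap before the embedding lookup.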

Furthermore, linear layers usually take 2D input, with the first dimension being the batch and the second being the input features. Yours is 3D because of the words (I assume); maybe you want to use a recurrent neural network (e.g. torch.nn.LSTM)?
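If a recurrent network is more than you need, one alternative way to get 2D input is to pool the context embeddings before the linear layers. The sketch below is an assumption, not the asker's code: it averages over the context dimension (a common CBOW choice) and assumes the criterion is nn.NLLLoss, which the question does not state. With a 3D output, a loss of that kind reads dimension 1 as the class dimension, so a target of shape (32,) would still not line up even after a permute.

```python
import torch
import torch.nn as nn

class CBOW(nn.Module):
    def __init__(self, vocab_size, embedding_dim):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.linear1 = nn.Linear(embedding_dim, 64)
        self.linear2 = nn.Linear(64, vocab_size)

    def forward(self, inputs):
        # inputs: (context, batch), as in the question
        embeds = self.embeddings(inputs)            # (context, batch, embedding)
        # Assumption: average the context word vectors into one vector per sample.
        pooled = embeds.mean(dim=0)                 # (batch, embedding)
        hidden = torch.relu(self.linear1(pooled))   # (batch, 64)
        return torch.log_softmax(self.linear2(hidden), dim=-1)  # (batch, vocab)

model = CBOW(vocab_size=9927, embedding_dim=50)
inputs = torch.randint(0, 9927, (4, 32))   # random word indices, illustrative only
target = torch.randint(0, 9927, (32,))     # one target word index per sample

output = model(inputs)
loss = nn.NLLLoss()(output, target)        # assumed criterion
print(output.shape)  # torch.Size([32, 9927])
```

Here the output has one row of log-probabilities per sample, so its first dimension matches the target's batch size of 32.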

Answered 2019-03-22 17:58:03