
PyTorch nn.CrossEntropyLoss runtime dimension out of range error

I'm currently implementing the continuous bag-of-words (CBOW) model using PyTorch. I'm facing some problems when implementing the cross entropy loss, though. Here's the portion of code that's causing the problem:

for idx, sample in enumerate(self.train_data):
    x = torch.tensor(sample[0], dtype=torch.long)
    y = np.zeros(shape=(self.vocab_size)) # self.vocab_size = 85,000
    y[int(sample[1])] = np.float64(1)
    y = torch.tensor(y, dtype=torch.long)

    if torch.cuda.is_available():
        x = x.cuda()
        y = y.cuda()

    optimizer.zero_grad()

    output = self.model(x) # output's shape is the same as self.vocab_size
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()

To briefly explain my code, the model that I've implemented basically outputs the averaged embedding values of a context array and performs a linear projection to project them into a shape that's identical to the size of the vocabulary. Then we run this array through a softmax function.

The contents of self.train_data are basically (context, target_word) pairs. y is a one-hot encoded array of the token.

I'm aware that the second input to nn.CrossEntropyLoss is C = # of classes, but I'm not sure where my code went wrong. The vocabulary size is 85,000, so isn't the number of classes 85,000?

If I change the input to

loss = criterion(output, 85000)

I get the same error:

*** RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

What am I doing wrong, and how should I understand the input to PyTorch's cross entropy loss?

Thanks.

I'm aware that the second input to nn.CrossEntropyLoss is C = # of classes, but I'm not sure where my code went wrong. The vocabulary size is 85,000, so isn't the number of classes 85,000?

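The number of classes (nc) may well be 85,000, but the criterion also expects a batch dimension (bs). The output should have shape (bs, nc), and the target should hold one class index per sample, with shape (bs,), rather than a one-hot vector. The error message points the same way: your output is 1-D, so dimension 1 (the class dimension the loss looks for) does not exist. A target of the right shape looks like this:

target = torch.randint(nc, (bs,))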

The target represents the true value, while output is what the model produces for a particular input x; in your case, output = self.model(x).
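To make the expected shapes concrete, here is a minimal sketch with small, made-up sizes (nc = 10 and bs = 4 are assumptions chosen for illustration, not values from the question). The random output tensor is only a stand-in for a real model. The sketch shows the criterion taking raw scores of shape (bs, nc) and integer class indices of shape (bs,), and how one-hot rows can be turned back into indices with argmax.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

nc, bs = 10, 4  # illustrative sizes only; the question uses nc = 85,000

criterion = nn.CrossEntropyLoss()

# Stand-in for the model output: one row of raw scores (logits) per sample.
output = torch.randn(bs, nc)

# One class index per sample, dtype torch.long.
target = torch.randint(nc, (bs,))

loss = criterion(output, target)  # scalar tensor (mean over the batch)

# If labels already exist as one-hot rows, argmax recovers the indices.
one_hot = torch.zeros(bs, nc)
one_hot[torch.arange(bs), target] = 1.0
recovered = one_hot.argmax(dim=1)
```

So if a one-hot array has already been built, taking its argmax is probably the smallest change that gives the loss what it wants.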

In this call:

loss = criterion(output, target)

output is what the model currently predicts, and target is what it should predict once training has converged.
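Applied to the loop in the question, which handles one sample at a time, a single training step could look like the sketch below. It assumes the model returns a 1-D tensor of vocab_size raw scores; TinyCBOW, the small vocab_size and embed_dim, and the sample pair are all hypothetical stand-ins, not the asker's actual code. A batch dimension of 1 is added to the output, and the target is the word's index rather than a one-hot vector. Note that nn.CrossEntropyLoss applies log-softmax internally, so the model should return raw scores without a softmax of its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, embed_dim = 50, 8  # hypothetical small sizes for illustration


class TinyCBOW(nn.Module):
    """Hypothetical stand-in: mean of context embeddings, then a linear layer."""

    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.proj = nn.Linear(embed_dim, vocab_size)

    def forward(self, context):
        # context: (context_len,) word indices -> (vocab_size,) raw scores
        return self.proj(self.embed(context).mean(dim=0))


model = TinyCBOW(vocab_size, embed_dim)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

sample = ([3, 7, 12, 20], 9)  # made-up (context, target_word) pair

x = torch.tensor(sample[0], dtype=torch.long)
y = torch.tensor([sample[1]], dtype=torch.long)  # shape (1,): class index, not one-hot

optimizer.zero_grad()
output = model(x).unsqueeze(0)  # (vocab_size,) -> (1, vocab_size)
loss = criterion(output, y)
loss.backward()
optimizer.step()
```

Stacking several samples into one batch would remove the need for unsqueeze, since the output would then already have shape (bs, vocab_size).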

