I am trying BERT on a Twitter dataset, and I run into an error when I execute the following code.
# set initial loss to infinite
best_valid_loss = float('inf')

# empty lists to store training and validation loss of each epoch
train_losses = []
valid_losses = []

# for each epoch
for epoch in range(epochs):
    print('\n Epoch {:} / {:}'.format(epoch + 1, epochs))

    # train model
    train_loss, _ = train()

    # evaluate model
    valid_loss, _ = evaluate()

    # save the best model
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'saved_weights.pt')

    # append training and validation loss
    train_losses.append(train_loss)
    valid_losses.append(valid_loss)

    print(f'\nTraining Loss: {train_loss:.3f}')
    print(f'Validation Loss: {valid_loss:.3f}')
The full code is very long. Searching for the issue led me to change .float() to .long(), which I have already done. Please suggest a solution. Very important: the same code works perfectly well on another dataset (with the same number of columns and the same type of data) but does not work on the tweets data. The only difference is the size: the previous dataset had 5,500 entries, while the tweet dataset has 10,000 entries.
I searched a lot for this error. In the end, I found that the main cause was not cleaning the dataset properly. In my case, the values in the label column were stored as floats rather than ints. Using pandas, I converted all the float values to int, and after that the code ran successfully. So spend more time on data cleaning than on writing code. Thank you.
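For readers who want to see what that cleaning step could look like, here is a minimal sketch. The column names "text" and "label" and the sample rows are hypothetical, and the original post does not say why the labels were floats. One common cause is a missing value, which makes pandas store the whole column as float64, so the sketch drops unlabeled rows before casting.

```python
import pandas as pd

# Hypothetical data: one missing label makes pandas store the column as float64.
df = pd.DataFrame({
    "text": ["tweet a", "tweet b", "tweet c", "tweet d"],
    "label": [0.0, 1.0, None, 1.0],
})
print(df["label"].dtype)  # float64

# Drop rows without a label, then cast the remaining labels to 64-bit integers.
clean = df.dropna(subset=["label"]).copy()
clean["label"] = clean["label"].astype("int64")
print(clean["label"].dtype)  # int64
```

Checking df["label"].dtype and df["label"].isna().sum() on both datasets is a quick way to see whether they really hold the same type of data.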
Do you have a categorical target that you are trying to predict (i.e. classification)? Let's say a binary target called y which has 0's and 1's? Your target variable is probably encoded not as int64, which is the Long format, but as int32. Just convert that target variable to 64-bit before you create the DataLoader, and you should be good.
There are 2 possible ways:
import torch
import numpy as np
y
# array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
# 1])
y.dtype
# dtype('int32')
# numpy
y = y.astype('int64')
y
# array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
# 1], dtype=int64)
or
# torch
y = torch.from_numpy(y).to(dtype=torch.long)
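To show where the conversion sits relative to the DataLoader, here is a rough sketch. The random features, the batch size, the random stand-in for the model output, and the choice of CrossEntropyLoss are all assumptions for illustration; the original post does not show its model or loss function.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 8 samples, 4 features, binary labels stored as int32.
features = torch.randn(8, 4)
y = np.array([0, 1, 0, 1, 1, 0, 0, 1], dtype=np.int32)

# Convert the labels to 64-bit integers (torch.long) BEFORE building the DataLoader.
labels = torch.from_numpy(y).to(dtype=torch.long)
loader = DataLoader(TensorDataset(features, labels), batch_size=4)

# Assumed loss function; it takes class indices as a long tensor.
criterion = torch.nn.CrossEntropyLoss()

losses = []
for batch_features, batch_labels in loader:
    # Random logits stand in for the model's output (2 classes).
    logits = torch.randn(batch_features.size(0), 2)
    losses.append(criterion(logits, batch_labels).item())
```

Converting once, before the dataset is wrapped, means every batch already carries long labels and the training loop needs no per-batch casts.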