
Increasing batch_size of dataset for PyTorch neural network

I currently have my neural network training with batch_size = 1. To run it across multiple GPUs I need the batch size to be larger than the number of GPUs, so I want batch_size = 16. However, with the way my data is set up, I am not sure how to change that.

The data is read from a CSV file

import pandas as pd

raw_data = pd.read_csv("final.csv")
train_data = raw_data[:750]
test_data = raw_data[750:]

Then the data is normalized and converted to PyTorch tensors

import torch
from sklearn.preprocessing import MinMaxScaler

# normalize features
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled_train = scaler.fit_transform(train_data)
scaled_test = scaler.transform(test_data)
# turn into PyTorch tensors (flattened to 1D)
train_data_normalized = torch.FloatTensor(scaled_train).view(-1)
test_data_normalized = torch.FloatTensor(scaled_test).view(-1)

Then the data is turned into a list of (input sequence, label) tensor tuples, e.g. (tensor([1, 3, 56, 63, 3]), tensor([34]))

# Convert to (sequence, label) tensor tuples
def input_series_sequence(input_data, tw):
    inout_seq = []
    L = len(input_data)
    # step through the data in non-overlapping windows of length tw
    for i in range(0, L - tw, tw):
        train_seq = input_data[i:i + tw]
        train_label = input_data[i + tw:i + tw + 1]
        inout_seq.append((train_seq, train_label))
    return inout_seq
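
For example, on a toy tensor with train_window = 5 (the values here are illustrative only), this produces pairs in the (sequence, label) format described above:

# quick sanity check of the output format (illustrative values)
toy = torch.arange(12, dtype=torch.float32)
pairs = input_series_sequence(toy, 5)
print(pairs[0])  # (tensor([0., 1., 2., 3., 4.]), tensor([5.]))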


train_inout_seq = input_series_sequence(train_data_normalized, train_window)
test_input_seq = input_series_sequence(test_data_normalized, train_window)

And then the model is trained like so

for i in range(epochs):
    for seq, labels in train_inout_seq:
        optimizer.zero_grad()
        model.module.hidden_cell = model.module.init_hidden()
        seq = seq.to(device)
        labels = labels.to(device)
        y_pred = model(seq)

        single_loss = loss_function(y_pred, labels)
        single_loss.backward()
        optimizer.step()

So I want to know how exactly to change the batch size from 1 to 16. Do I need to use Dataset and DataLoader? And if so, how exactly would that fit in with my current code? Thanks!

Edit: the model is defined like this; I might have to change the forward function?

class LSTM(nn.Module):
    def __init__(self, input_size=1, hidden_layer_size=100, output_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size

        self.lstm = nn.LSTM(input_size, hidden_layer_size)

        self.linear = nn.Linear(hidden_layer_size, output_size)

        self.hidden_cell = (torch.zeros(1, 1, self.hidden_layer_size),
                            torch.zeros(1, 1, self.hidden_layer_size))

    def init_hidden(self):
        return (torch.zeros(1, 1, self.hidden_layer_size),
                torch.zeros(1, 1, self.hidden_layer_size))

    def forward(self, input_seq):
        lstm_out, self.hidden_cell = self.lstm(input_seq.view(len(input_seq), 1, -1), self.hidden_cell)
        predictions = self.linear(lstm_out.view(len(input_seq), -1))
        return predictions[-1]

You can do this by wrapping your model in the nn.DataParallel class.

model = nn.DataParallel(model)

Since I don't have access to multiple GPUs or your data to test right now, I'll direct you here.
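
As a rough, untested sketch of how the pieces could fit together: the list of (seq, label) tuples from your code can be passed straight to a DataLoader, whose default collate_fn stacks them into (batch, train_window) and (batch, 1) tensors, and DataParallel then splits each batch across the GPUs. BatchLSTM below is a hypothetical batch-first rewrite of your model, and the MSE loss / Adam settings are assumptions since your post doesn't show them; train_inout_seq and epochs come from your code above.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class BatchLSTM(nn.Module):
    # hypothetical batch-friendly variant: batch_first=True, and the hidden
    # state is created per forward pass, sized to the (per-GPU) batch
    def __init__(self, input_size=1, hidden_layer_size=100, output_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size
        self.lstm = nn.LSTM(input_size, hidden_layer_size, batch_first=True)
        self.linear = nn.Linear(hidden_layer_size, output_size)

    def forward(self, input_seq):
        x = input_seq.unsqueeze(-1)        # (batch, seq_len) -> (batch, seq_len, 1)
        batch_size = x.size(0)
        hidden = (torch.zeros(1, batch_size, self.hidden_layer_size, device=x.device),
                  torch.zeros(1, batch_size, self.hidden_layer_size, device=x.device))
        lstm_out, _ = self.lstm(x, hidden)
        return self.linear(lstm_out[:, -1, :])  # last time step -> (batch, 1)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.DataParallel(BatchLSTM()).to(device)
loss_function = nn.MSELoss()                                 # assumption
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)   # assumption

# the default collate_fn stacks the (seq, label) tuples into batches
train_loader = DataLoader(train_inout_seq, batch_size=16, shuffle=True)

for epoch in range(epochs):
    for seq, labels in train_loader:
        seq, labels = seq.to(device), labels.to(device)
        optimizer.zero_grad()
        y_pred = model(seq)
        single_loss = loss_function(y_pred, labels)
        single_loss.backward()
        optimizer.step()

The key change is that the hidden state is no longer stored on the module between calls, which lets DataParallel replicate the model across GPUs cleanly instead of each replica writing back to a shared hidden_cell.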
