PyTorch path generation with RNN - confusion with input, output, hidden and batch sizes

I'm new to PyTorch. I followed a tutorial on sentence generation with an RNN and I'm trying to modify it to generate sequences of positions; however, I'm having trouble defining the correct model parameters such as input_size, output_size, hidden_dim, and batch_size.

Background: I have 596 sequences of x,y positions, each looking like [[x1,y1],[x2,y2],...,[xn,yn]]. Each sequence represents the 2D path of a vehicle. I would like to train a model that, given a starting point (or a partial sequence), could generate one of these sequences.

- I have padded/truncated the sequences so that they all have length 50, meaning each sequence is an array of shape [50,2].

- I then divided this data into input_seq and target_seq:

input_seq: tensor of torch.Size([596, 49, 2]). Contains all 596 sequences, each without its last position.

target_seq: tensor of torch.Size([596, 49, 2]). Contains all 596 sequences, each without its first position.
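
For reference, the split was done roughly like this (a sketch; sequences is assumed to be the padded/truncated data as a tensor of shape [596, 50, 2]):

# sequences: padded/truncated paths, assumed shape [596, 50, 2]
input_seq = sequences[:, :-1, :]   # every position except the last  -> [596, 49, 2]
target_seq = sequences[:, 1:, :]   # every position except the first -> [596, 49, 2]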

The model class:

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(Model, self).__init__()
        # Defining some parameters
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers
        # Defining the layers
        # RNN layer
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        # Fully connected layer
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        batch_size = x.size(0)
        # Initializing hidden state for first input using method defined below
        hidden = self.init_hidden(batch_size)
        # Passing in the input and hidden state into the model and obtaining outputs
        out, hidden = self.rnn(x, hidden)
        # Reshaping the outputs so they can be fed into the fully connected layer
        out = out.contiguous().view(-1, self.hidden_dim)
        out = self.fc(out)
        return out, hidden

    def init_hidden(self, batch_size):
        # This method generates the first hidden state of zeros which we'll use in the forward pass
        # We'll send the tensor holding the hidden state to the device we specified earlier as well
        hidden = torch.zeros(self.n_layers, batch_size, self.hidden_dim)
        return hidden

I instantiate the model with the following parameters:

input_size of 2 (an [x,y] position)

output_size of 2 (an [x,y] position)

hidden_dim of 2 (an [x,y] position) (or should this be 50 as in the length of a full sequence?)

model = Model(input_size=2, output_size=2, hidden_dim=2, n_layers=1)
n_epochs = 100
lr=0.01
# Define Loss, Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

# Training Run
for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad() # Clears existing gradients from previous epoch
    output, hidden = model(input_seq)
    loss = criterion(output, target_seq.view(-1).long())
    loss.backward() # Does backpropagation and calculates gradients
    optimizer.step() # Updates the weights accordingly
    if epoch%10 == 0:
        print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
        print("Loss: {:.4f}".format(loss.item()))

When I run the training loop, it fails with this error:

ValueError                                Traceback (most recent call last)
<ipython-input-9-ad1575e0914b> in <module>
      3     optimizer.zero_grad() # Clears existing gradients from previous epoch
      4     output, hidden = model(input_seq)
----> 5     loss = criterion(output, target_seq.view(-1).long())
      6     loss.backward() # Does backpropagation and calculates gradients
      7     optimizer.step() # Updates the weights accordingly
...

ValueError: Expected input batch_size (29204) to match target batch_size (58408).

I tried modifying input_size, output_size, hidden_dim, and batch_size and reshaping the tensors, but the more I try the more confused I get. Could someone point out what I am doing wrong?

Furthermore, since batch size is defined as x.size(0) in Model.forward(self, x), this means I only have a single batch of size 596, right? What would be the correct way to have multiple smaller batches?

The output has size [batch_size * seq_len, 2] = [29204, 2], while you flatten target_seq, which then has size [batch_size * seq_len * 2] = [58408]. They have the same total number of elements, but not the same number of dimensions, so their first dimensions are not identical, which is exactly what the error complains about.
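
You can see this directly by checking the shapes of the two tensors that go into the loss (a quick check using the model and tensors from the question):

output, hidden = model(input_seq)
print(output.shape)                # torch.Size([29204, 2]), since 596 * 49 = 29204
print(target_seq.view(-1).shape)   # torch.Size([58408]),    since 596 * 49 * 2 = 58408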

Regardless of the dimension mismatch, nn.CrossEntropyLoss is a categorical loss function, which means it would only predict a class from the output. You don't have any classes; you are trying to predict coordinates, which are continuous values. For this you need to use a regression loss function, such as nn.MSELoss, which calculates the squared error/distance between the predicted and target coordinates.

criterion = nn.MSELoss()

# .flatten() does the same thing as .view(-1) but is more descriptive
loss = criterion(output.flatten(), target_seq.flatten())

The flattening can be avoided altogether, since both the loss function and the linear layer can operate on multidimensional inputs. That removes the risk of getting lost while flattening and restoring the dimensions, and the output is easier to inspect or to use later outside of training. For the linear layer, only the last dimension of the input needs to match the in_features of nn.Linear, which is hidden_dim in your case.

def forward(self, x):
    batch_size = x.size(0)      
    # Initializing hidden state for first input using method defined below
    hidden = self.init_hidden(batch_size)
    # Passing in the input and hidden state into the model and obtaining outputs
    # out size: [batch_size, seq_len, hidden_dim]
    out, hidden = self.rnn(x, hidden)
    # out size: [batch_size, seq_len, output_size]
    out = self.fc(out)        
    return out, hidden

Now the output of the model has the same size as target_seq and you can call the loss function directly without flattening:

loss = criterion(output, target_seq)

hidden_dim of 2 (an [x,y] position) (or should this be 50 as in the length of a full sequence?)

The hidden_dim is not a pair of [x, y] and is completely unrelated to both the input_size and output_size. It defines the number of hidden features of the RNN, which is essentially its capacity; bigger sizes potentially have more room to retain essential information, but also require more computation. There is no perfect hidden size and it largely depends on the use case. You can experiment with different sizes, e.g. 100, 256, etc., and see whether that improves your results.
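
For example, you could keep input_size and output_size at 2 (those are fixed by your data) and only increase the hidden size; 128 here is an arbitrary starting value, not a recommendation:

model = Model(input_size=2, output_size=2, hidden_dim=128, n_layers=1)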

Furthermore, since batch size is defined as x.size(0) in Model.forward(self, x), this means I only have a single batch of size 596, right? What would be the correct way to have multiple smaller batches?

Yes, you only have a single batch of size 596. If you want to use smaller batches, for example if you cannot fit all of them into a more complex model, you could easily use slices of them, but it would be better to use PyTorch's data utilities: torch.utils.data.TensorDataset to get a dataset, where each sequence of the input has a corresponding target, in combination with torch.utils.data.DataLoader, which creates the batches for you.

from torch.utils.data import DataLoader, TensorDataset

# Match each sequence of the input_seq to the corresponding target_seq.
# e.g. dataset[0] == (input_seq[0], target_seq[0])
dataset = TensorDataset(input_seq, target_seq)

# Randomly shuffle the data and load it in batches of 16
data_loader = DataLoader(dataset, batch_size=16, shuffle=True)

# Process one batch at a time
for input, target in data_loader:
    output, hidden = model(input)
    loss = criterion(output, target)
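
Putting the pieces together, the revised training loop could look roughly like this (a sketch reusing the hyperparameters from above; hidden_dim=128 and batch_size=16 are arbitrary choices, and the model uses the modified forward without flattening):

model = Model(input_size=2, output_size=2, hidden_dim=128, n_layers=1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

n_epochs = 100
for epoch in range(1, n_epochs + 1):
    for input, target in data_loader:
        optimizer.zero_grad()             # clear gradients from the previous step
        output, hidden = model(input)     # output: [batch_size, seq_len, 2]
        loss = criterion(output, target)  # squared error between predicted and target coordinates
        loss.backward()                   # backpropagation
        optimizer.step()                  # update the weights
    if epoch % 10 == 0:
        print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
        print("Loss: {:.4f}".format(loss.item()))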
