
How should I use num_layers in a PyTorch LSTM model?

Hi, I am a newbie with RNNs and PyTorch. I am implementing a model to predict data. I first used only a single layer and the result was fine. Now I want to improve the accuracy of the model and use 2 layers in the LSTM. However, the output of the LSTM is very strange to me.

I was expecting a [1, 8] output. However, with num_layers=2 I get a [2, 8] result. What does this result mean? Which part should I use as the output of the LSTM?

Here is my code:

import torch
import torch.nn as nn
from torch.autograd import Variable


class LSTM(nn.Module):

    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        self.lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim,
                            num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # x = Variable(torch.Tensor(x).cuda())

        # initial hidden and cell states: (num_layers, batch, hidden_dim)
        h_0 = Variable(torch.zeros(
            self.num_layers, x.size(0), self.hidden_dim).cuda())
        c_0 = Variable(torch.zeros(
            self.num_layers, x.size(0), self.hidden_dim).cuda())

        # Propagate input through LSTM
        ula, (h_out, _) = self.lstm(x, (h_0, c_0))

        out = self.fc(h_out).cuda()
        return out

h_out has the shape (num_layers, batch, hidden_size): it stacks the final hidden state of every layer, which is why you see a leading dimension of 2 with num_layers=2. If you want to pass it to the output layer fc, you need to move the batch dimension to the front, flatten the per-layer hidden states into one vector, and increase the input size of fc to hidden_size * num_layers.
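For reference, here is a minimal standalone sketch (the sizes are made up: batch 1, sequence length 5, 3 input features, hidden size 8) showing the two return values of nn.LSTM with num_layers=2; the extra leading dimension in your result comes from h_n stacking one final hidden state per layer:

import torch
import torch.nn as nn

# made-up sizes, for shape inspection only
lstm = nn.LSTM(input_size=3, hidden_size=8, num_layers=2, batch_first=True)
x = torch.randn(1, 5, 3)             # (batch, seq_len, input_size)

output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([1, 5, 8])  last layer's hidden state at every time step
print(h_n.shape)     # torch.Size([2, 1, 8])  final hidden state of each layer: (num_layers, batch, hidden_size)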

class LSTM(nn.Module):

    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        self.lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim,
                            num_layers=num_layers, batch_first=True)
        # fc now takes the concatenated final hidden states of all layers
        self.fc = nn.Linear(hidden_dim * num_layers, output_dim)

    def forward(self, x):
        # initial hidden and cell states: (num_layers, batch, hidden_dim)
        h_0 = Variable(torch.zeros(
            self.num_layers, x.size(0), self.hidden_dim).cuda())
        c_0 = Variable(torch.zeros(
            self.num_layers, x.size(0), self.hidden_dim).cuda())

        # Propagate input through LSTM
        ula, (h_out, _) = self.lstm(x, (h_0, c_0))

        # h_out: (num_layers, batch, hidden_dim) -> (batch, num_layers * hidden_dim)
        out = h_out.transpose(0, 1)
        out = out.reshape(-1, self.hidden_dim * self.num_layers)
        out = self.fc(out)
        return out
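A hypothetical usage sketch, assuming the class and imports above, made-up input and hidden sizes, and a CUDA device (the forward pass creates its initial states on the GPU); output_dim=8 matches the shape from the question:

model = LSTM(input_dim=3, hidden_dim=16, num_layers=2, output_dim=8).cuda()

x = torch.randn(4, 10, 3).cuda()   # (batch, seq_len, input_dim)
out = model(x)
print(out.shape)                   # torch.Size([4, 8])  one prediction per sequence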
