Hidden size vs input size in RNN

Premise 1:

Regarding neurons in an RNN layer - it is my understanding that at "each time step, every neuron receives both the input vector x(t) and the output vector from the previous time step y(t–1)" [1]:

https://github.com/ebolotin6/ebolotin6.github.io/blob/master/images/rnn.png
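
A minimal sketch of that recurrence for a single plain RNN step (illustrative sizes and weight names, not from the book):

import torch

# One step of a plain RNN: every unit sees the current input x_t and the
# previous step's output h_prev: h_t = tanh(W_x x_t + W_h h_prev + b)
input_size, hidden_size = 5, 3               # note: they need not be equal
W_x = torch.randn(hidden_size, input_size)   # input-to-hidden weights
W_h = torch.randn(hidden_size, hidden_size)  # hidden-to-hidden weights
b = torch.zeros(hidden_size)

x_t = torch.randn(input_size)                # input vector at time t
h_prev = torch.zeros(hidden_size)            # output from the previous step
h_t = torch.tanh(W_x @ x_t + W_h @ h_prev + b)
print(h_t.shape)  # torch.Size([3])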

Premise 2:

It is also my understanding that in PyTorch's GRU layer, input_size and hidden_size mean the following:

  • input_size – The number of expected features in the input x
  • hidden_size – The number of features in the hidden state h

So naturally, hidden_size should represent the number of neurons in a GRU layer.
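
One quick way to check this reading is to inspect the layer's weight shapes (a sanity check, assuming PyTorch's documented parameter layout):

import torch.nn as nn

gru = nn.GRU(input_size=5, hidden_size=3)
# weight_ih_l0 stacks the reset/update/new gate weights, so its shape is
# (3 * hidden_size, input_size); weight_hh_l0 is (3 * hidden_size, hidden_size).
print(gru.weight_ih_l0.shape)  # torch.Size([9, 5])
print(gru.weight_hh_l0.shape)  # torch.Size([9, 3])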

My question:

Given the following GRU layer:

# assume that hidden_size = 3

class Encoder(nn.Module):
    def __init__(self, src_dictionary_size, hidden_size):
        super(Encoder, self).__init__()
        self.embedding = nn.Embedding(src_dictionary_size, hidden_size)
        self.gru = nn.GRU(input_size=hidden_size, hidden_size=hidden_size)

Assuming a hidden_size of 3, my understanding is that the GRU layer above would have 3 neurons, each of which accepts an input vector of size 3 at every time step.

My question is: why do the arguments to hidden_size and input_size have to be equal? I.e., why can't each of the 3 neurons accept, say, an input vector of size 5?

Case in point: both of the following produce a size mismatch:

self.gru = nn.GRU(input_size=hidden_size, hidden_size=hidden_size-1)
self.gru = nn.GRU(input_size=hidden_size, hidden_size=hidden_size+1)
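
A stripped-down repro of the same mismatch (illustrative tensors, not from the original code) suggests the error comes from the hidden state, not from the layer itself:

import torch
import torch.nn as nn

gru = nn.GRU(input_size=3, hidden_size=2)  # unequal sizes construct fine
x = torch.randn(4, 1, 3)                   # (seq_len, batch, input_size)
h0 = torch.zeros(1, 1, 3)                  # sized to input_size, not hidden_size
out, h_n = gru(x, h0)                      # raises a RuntimeError: hidden size mismatch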

[1] Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow (p. 388). O'Reilly Media. Kindle Edition.

[3] https://pytorch.org/docs/stable/nn.html#torch.nn.GRU


Adding full code for reproducibility:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class Encoder(nn.Module):
    def __init__(self, src_dictionary_size, hidden_size):
        super(Encoder, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(src_dictionary_size, hidden_size)
        self.gru = nn.GRU(input_size=hidden_size, hidden_size=hidden_size-1)

    def forward(self, pad_seqs, seq_lengths, hidden):
        """
        Args:
          pad_seqs of shape (max_seq_length, batch_size, 1): Padded source sequences.
          seq_lengths: List of sequence lengths.
          hidden of shape (1, batch_size, hidden_size): Initial states of the GRU.

        Returns:
          outputs of shape (max_seq_length, batch_size, hidden_size): Padded outputs of GRU at every step.
          hidden of shape (1, batch_size, hidden_size): Updated states of the GRU.
        """
        embedded_sqs = self.embedding(pad_seqs).squeeze(2)
        packed_sqs = pack_padded_sequence(embedded_sqs, seq_lengths)
        packed_output, h_n = self.gru(packed_sqs, hidden)
        output, input_sizes = pad_packed_sequence(packed_output)

        return output, h_n

    def init_hidden(self, batch_size=1):
        return torch.zeros(1, batch_size, self.hidden_size)

def test_Encoder_shapes():
    hidden_size = 5
    encoder = Encoder(src_dictionary_size=5, hidden_size=hidden_size)

    # maximum word count
    max_seq_length = 4

    # num sentences
    batch_size = 2
    hidden = encoder.init_hidden(batch_size=batch_size)

    # padded sequences (sentences of words): 2 sentences (batch_size = 2), each padded to a maximum of 4 words
    pad_seqs = torch.tensor([
        [1, 2],
        [2, 3],
        [3, 0],
        [4, 0]
    ]).view(max_seq_length, batch_size, 1)

    outputs, new_hidden = encoder.forward(pad_seqs=pad_seqs, seq_lengths=[4, 2], hidden=hidden)
    assert outputs.shape == torch.Size([4, batch_size, hidden_size]), f"Bad outputs.shape: {outputs.shape}"
    assert new_hidden.shape == torch.Size([1, batch_size, hidden_size]), f"Bad new_hidden.shape: {new_hidden.shape}"
    print('Success')

test_Encoder_shapes()
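
For the record, this test never reaches its assertions: the GRU was built with hidden_size - 1 = 4 hidden units, so it rejects the initial hidden state of shape (1, 2, 5) with a runtime size-mismatch error. That is the error resolved below.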

I just resolved this and the mistake was self-inflicted.

Conclusion: input_size and hidden_size can differ in size, and there is no inherent problem with this. The premises in the question are correctly stated.

The problem with the (full) code above was that the initial hidden state of the GRU did not have the correct dimensions. The initial hidden state must have the same dimensions as subsequent hidden states. In my case, the initial hidden state had the shape (1, 2, 5) instead of (1, 2, 4). In the former, 5 represents the dimensionality of the embedding vector; 4 represents the hidden_size (number of neurons) in the GRU. The correct code is below:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class Encoder(nn.Module):
    def __init__(self, src_dictionary_size, input_size, hidden_size):
        super(Encoder, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(src_dictionary_size, input_size)
        self.gru = nn.GRU(input_size=input_size, hidden_size=hidden_size)

    def forward(self, pad_seqs, seq_lengths, hidden):
        """
        Args:
          pad_seqs of shape (max_seq_length, batch_size, 1): Padded source sequences.
          seq_lengths: List of sequence lengths.
          hidden of shape (1, batch_size, hidden_size): Initial states of the GRU.

        Returns:
          outputs of shape (max_seq_length, batch_size, hidden_size): Padded outputs of GRU at every step.
          hidden of shape (1, batch_size, hidden_size): Updated states of the GRU.
        """
        embedded_sqs = self.embedding(pad_seqs).squeeze(2)
        packed_sqs = pack_padded_sequence(embedded_sqs, seq_lengths)
        packed_output, h_n = self.gru(packed_sqs, hidden)
        output, input_sizes = pad_packed_sequence(packed_output)

        return output, h_n

    def init_hidden(self, batch_size=1):
        return torch.zeros(1, batch_size, self.hidden_size)

def test_Encoder_shapes():
    hidden_size = 4
    embedding_size = 5
    encoder = Encoder(src_dictionary_size=5, input_size=embedding_size, hidden_size=hidden_size)
    print(encoder)

    max_seq_length = 4
    batch_size = 2
    hidden = encoder.init_hidden(batch_size=batch_size)
    pad_seqs = torch.tensor([
        [1, 2],
        [2, 3],
        [3, 0],
        [4, 0]
    ]).view(max_seq_length, batch_size, 1)

    outputs, new_hidden = encoder.forward(pad_seqs=pad_seqs, seq_lengths=[4, 2], hidden=hidden)
    assert outputs.shape == torch.Size([4, batch_size, hidden_size]), f"Bad outputs.shape: {outputs.shape}"
    assert new_hidden.shape == torch.Size([1, batch_size, hidden_size]), f"Bad new_hidden.shape: {new_hidden.shape}"
    print('Success')

test_Encoder_shapes()
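
As a minimal sanity check of that conclusion (illustrative shapes only, not from the original post):

import torch
import torch.nn as nn

gru = nn.GRU(input_size=5, hidden_size=4)  # unequal sizes are fine
x = torch.randn(7, 2, 5)                   # (seq_len, batch, input_size)
h0 = torch.zeros(1, 2, 4)                  # (num_layers, batch, hidden_size)
output, h_n = gru(x, h0)
print(output.shape, h_n.shape)  # torch.Size([7, 2, 4]) torch.Size([1, 2, 4])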
