Tensorflow: variable sequence length AND batch size

My dataset consists of sentences. Each sentence has a variable length and is initially encoded as a sequence of vocabulary indexes, i.e. a tensor of shape [sentence_len]. The batch size is also variable.

I have grouped sentences of similar lengths into buckets and padded where necessary, to bring each sentence in a bucket to the same length.
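
(For reference, the bucketing step above could look like the sketch below; bucket_and_pad, the bucket bounds, and pad_id are hypothetical names chosen for illustration, not from the question.)

    import numpy as np

    # Hypothetical helper for the bucketing described above: group sentences
    # (lists of vocabulary indexes) by length, then pad every sentence in a
    # bucket up to that bucket's length. Sentences longer than the largest
    # bound are silently dropped here, for brevity.
    def bucket_and_pad(sentences, bucket_bounds=(10, 20, 40), pad_id=0):
        buckets = {b: [] for b in bucket_bounds}
        for s in sentences:
            for b in bucket_bounds:
                if len(s) <= b:
                    buckets[b].append(list(s) + [pad_id] * (b - len(s)))
                    break
        # Each non-empty bucket becomes a [num_sentences, bucket_len] array.
        return {b: np.array(v, dtype=np.int32) for b, v in buckets.items() if v}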

How could I deal with having both an unknown sentence length AND batch size?

My data provider tells me the sentence length for every batch, but I don't know how to feed that in -> the graph is already built at that point. The input is represented with a placeholder x = tf.placeholder(tf.int32, shape=[batch_size, sentence_length], name='x'). I can set batch_size or sentence_length to None, but not both.

UPDATE: in fact, interestingly, I can set both to None, but then I get the warning Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory. Note: the next layer is an embedding_lookup.
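
(For concreteness, a minimal sketch of the setup described above, TF 1.x graph mode; the vocabulary size and embedding dimension are illustrative values, not from the question.)

    import tensorflow as tf

    vocab_size, embed_dim = 10000, 128   # illustrative values

    # Both the batch and the time dimension are left dynamic.
    x = tf.placeholder(tf.int32, shape=[None, None], name='x')

    embeddings = tf.get_variable('embeddings', [vocab_size, embed_dim])
    embedded = tf.nn.embedding_lookup(embeddings, x)  # [batch, time, embed_dim]

    # The quoted warning appears when the sparse IndexedSlices gradient of
    # the lookup is later converted to a dense tensor of unknown shape.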

I'm not sure what this means or how to avoid it. I assume it has something to do with using tf.gather later, which I need to use. Alternatively, is there any other way to achieve what I need?

Thank you.

Unfortunately there is no workaround here unless you provide a tf.Variable() (which is not possible in your case) as the params argument of tf.nn.embedding_lookup() / tf.gather(). This happens because, when the input is declared with a placeholder of shape [None, None], the gradient that tf.gather() produces is a tf.IndexedSlices(), i.e. a sparse tensor.
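
(A small sketch of the distinction the answer is drawing, with illustrative shapes: gathering directly from a tf.Variable lets the optimizer apply the sparse IndexedSlices gradient as-is, while gathering from a computed tensor forces the backward pass to densify it.)

    import tensorflow as tf

    x = tf.placeholder(tf.int32, shape=[None, None])
    table = tf.get_variable('table', [10000, 128])

    # Gathering straight from the variable: the optimizer consumes the
    # IndexedSlices gradient sparsely, so nothing is densified.
    direct = tf.nn.embedding_lookup(table, x)

    # Gathering from a derived tensor: the IndexedSlices gradient must be
    # densified before it can flow back through the multiply, which is what
    # emits the quoted warning when the tensor's static shape is unknown.
    indirect = tf.gather(table * 2.0, x)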

I have already worked on projects that ran into this warning. What I can tell you is that if a tf.nn.dynamic_rnn() follows the embedding_lookup, set the swap_memory parameter of tf.nn.dynamic_rnn() to True, as sketched below. Also, to avoid OOM or Resource Exhausted errors, make the batch size smaller (test different batch sizes).
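
(A sketch of that mitigation, assuming the embedded batch from the question feeds an LSTM; the cell size and the seq_lens placeholder are illustrative assumptions.)

    import tensorflow as tf

    x = tf.placeholder(tf.int32, shape=[None, None], name='x')
    seq_lens = tf.placeholder(tf.int32, shape=[None], name='seq_lens')

    embeddings = tf.get_variable('embeddings', [10000, 128])
    embedded = tf.nn.embedding_lookup(embeddings, x)

    cell = tf.nn.rnn_cell.LSTMCell(256)
    outputs, state = tf.nn.dynamic_rnn(
        cell,
        embedded,
        sequence_length=seq_lens,  # true lengths so padded steps are skipped
        dtype=tf.float32,
        swap_memory=True)          # swap activations to host memory to avoid OOM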

There are already some good explanations of this. Please refer to the following Stack Overflow question:

Tensorflow dense gradient explanation?
