
How to reuse RNN in TensorFlow

I want to implement a model like DSSM (Deep Semantic Similarity Model).

I want to train one RNN model, use it to get three hidden vectors for three different inputs, and then use these hidden vectors to compute the loss function.

I tried writing the code inside a variable scope with reuse=None, like this:

gru_cell = tf.nn.rnn_cell.GRUCell(size)
gru_cell = tf.nn.rnn_cell.DropoutWrapper(gru_cell, output_keep_prob=0.5)
cell = tf.nn.rnn_cell.MultiRNNCell([gru_cell] * 2, state_is_tuple=True)

embedding = tf.get_variable("embedding", [vocab_size, wordvec_size])
inputs = tf.nn.embedding_lookup(embedding, self._input_data)
inputs = tf.nn.dropout(inputs, 0.5)
with tf.variable_scope("rnn"):
    _, rnn_states_1 = tf.nn.dynamic_rnn(cell, inputs, sequence_length=self.lengths, dtype=tf.float32)
    self._states_1 = rnn_states_1[config.num_layers - 1]
with tf.variable_scope("rnn", reuse=True):
    _, rnn_states_2 = tf.nn.dynamic_rnn(cell, inputs, sequence_length=self.lengths, dtype=tf.float32)
    self._states_2 = rnn_states_2[config.num_layers - 1]

I use the same inputs and reuse the RNN model, but when I print self._states_1 and self._states_2, the two vectors are different.

I compute rnn_states_2 inside with tf.variable_scope("rnn", reuse=True): because I want it to use the same RNN model that produced rnn_states_1.

So why do I get different hidden vectors from the same inputs and the same model?

Where did I go wrong?

Thanks for your answers.

Update: I found that the cause may be tf.nn.rnn_cell.DropoutWrapper. When I remove the dropout wrapper, the hidden vectors are the same; when I add it back, the vectors become different.
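
A quick way to confirm this (a hypothetical toggle, not part of the original code) is to parameterize the keep probability, since output_keep_prob=1.0 turns the wrapper into a no-op:

keep_prob = 1.0  # hypothetical switch: 1.0 disables dropout, 0.5 re-enables it
gru_cell = tf.nn.rnn_cell.GRUCell(size)
gru_cell = tf.nn.rnn_cell.DropoutWrapper(gru_cell, output_keep_prob=keep_prob)
cell = tf.nn.rnn_cell.MultiRNNCell([gru_cell] * 2, state_is_tuple=True)
# With keep_prob = 1.0 the two dynamic_rnn calls return identical states;
# with keep_prob = 0.5 each call samples a fresh random mask, so they differ.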

So, the new questions are:

How can I fix (hold constant) which part of the vector gets dropped out? By setting the seed parameter?

When training a DSSM, should I fix the dropout behavior?

If you structure your code to use tf.contrib.rnn.DropoutWrapper, you can set variational_recurrent=True in the wrapper, which causes the same dropout mask to be used at every time step, i.e. the dropout mask will be constant. Is that what you want?
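
For example (a minimal sketch assuming the TF 1.x contrib API; size is a stand-in for the hidden-unit count), the wrapper from the question could be rebuilt like this:

gru_cell = tf.nn.rnn_cell.GRUCell(size)
# variational_recurrent=True samples one dropout mask per batch and reuses
# it at every time step; dtype is required when this flag is set.
gru_cell = tf.contrib.rnn.DropoutWrapper(
    gru_cell,
    output_keep_prob=0.5,
    variational_recurrent=True,
    dtype=tf.float32)
cell = tf.nn.rnn_cell.MultiRNNCell([gru_cell] * 2, state_is_tuple=True)

Note that the mask is constant across time steps within one forward pass, but it is still resampled between separate session runs.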

Setting the seed parameter in tf.nn.dropout will just make sure that you get the same sequence of dropout masks every time you run with that seed. That does not mean the dropout mask will be constant, just that you'll always see the same dropout mask at a particular iteration. The mask will be different for every iteration.
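
A small demonstration of this behavior (a sketch assuming the TF 1.x session API):

import tensorflow as tf

x = tf.ones([1, 8])
y = tf.nn.dropout(x, 0.5, seed=42)

with tf.Session() as sess:
    print(sess.run(y))  # first mask
    print(sess.run(y))  # a different mask, despite the same seed

Re-running the whole script reproduces the same two masks in the same order: the seed fixes the sequence of masks, not any individual mask.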
