
How to pass states from one batch to another while training using tf.nn.bidirectional_dynamic_rnn?

I want to carry the final_state computed on one batch over to the next batch, in order to apply truncated backpropagation through time. I fetch the state through placeholders, but their type does not match what the cells expect, so they cannot be passed directly as the initial state. I am new to TensorFlow and need help with this.

# starting of LSTM
def lstm_cell_f():
    return tf.contrib.rnn.BasicLSTMCell(size, reuse=tf.get_variable_scope().reuse)

def lstm_cell_b():
    return tf.contrib.rnn.BasicLSTMCell(size, reuse=tf.get_variable_scope().reuse)

cell_f = tf.contrib.rnn.MultiRNNCell([lstm_cell_f() for _ in range(num_layers)])
cell_b = tf.contrib.rnn.MultiRNNCell([lstm_cell_b() for _ in range(num_layers)])

initial_state_f = cell_f.zero_state(batch_size, dtype=tf.float32)
initial_state_b = cell_b.zero_state(batch_size, dtype=tf.float32)

def RNN_OUT(da, state_f, state_b):
    outputs, states = tf.nn.bidirectional_dynamic_rnn(
        cell_f, cell_b, da,
        initial_state_fw=initial_state_f, initial_state_bw=initial_state_b,
        swap_memory=True)
    return states, tf.concat(outputs, 2)

i_state_f = tf.placeholder(tf.float32, None, name="i_state_f")
i_state_b = tf.placeholder(tf.float32, None, name="i_state_b")

_input = tf.placeholder(tf.float32, shape=[batch_size, num_steps, size], name="input")
final_states, _output = RNN_OUT(_input, i_state_f, i_state_b)
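For context, here is why a flat float placeholder fails: for a MultiRNNCell of LSTM cells, each direction's state is a tuple of num_layers LSTMStateTuple(c, h) pairs, each of shape [batch_size, size]. The nesting can be mimicked outside TensorFlow with plain NumPy (a minimal sketch; the dimensions are illustrative toy values, and the namedtuple stands in for tf.contrib.rnn.LSTMStateTuple):

```python
import collections
import numpy as np

# Illustrative stand-in for tf.contrib.rnn.LSTMStateTuple.
LSTMStateTuple = collections.namedtuple("LSTMStateTuple", ["c", "h"])

num_layers, batch_size, size = 2, 4, 8  # toy dimensions

# Structurally, what cell.zero_state(batch_size, tf.float32) produces:
# a tuple with one LSTMStateTuple(c, h) per LSTM layer.
zero_state = tuple(
    LSTMStateTuple(c=np.zeros((batch_size, size), dtype=np.float32),
                   h=np.zeros((batch_size, size), dtype=np.float32))
    for _ in range(num_layers))

print(len(zero_state))        # one entry per layer: 2
print(zero_state[0].c.shape)  # per-layer cell state: (4, 8)
```

A single unstructured placeholder cannot substitute for this nested tuple, which is exactly the type mismatch described above.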

In short, I want to fetch final_states back out through placeholders and then use them as the initial_states of the forward and backward LSTMs.

I found the answer at this link. Sorry for posting the question without searching thoroughly for an answer first.

def RNN_OUT(da, state_f, state_b):
    outputs, states = tf.nn.bidirectional_dynamic_rnn(
        cell_f, cell_b, da,
        initial_state_fw=state_f, initial_state_bw=state_b,
        swap_memory=True)
    return states, tf.concat(outputs, 2)

# Feed each direction's state as one [num_layers, 2, batch_size, size] tensor,
# then rebuild the tuple-of-LSTMStateTuple structure the cells expect.
i_state_f = tf.placeholder(tf.float32, shape=[num_layers, 2, batch_size, size], name="i_state_f")
l_f = tf.unstack(i_state_f, axis=0)
rnn_state_f = tuple(
    tf.contrib.rnn.LSTMStateTuple(l_f[idx][0], l_f[idx][1])
    for idx in range(num_layers))

i_state_b = tf.placeholder(tf.float32, shape=[num_layers, 2, batch_size, size], name="i_state_b")
l_b = tf.unstack(i_state_b, axis=0)
rnn_state_b = tuple(
    tf.contrib.rnn.LSTMStateTuple(l_b[idx][0], l_b[idx][1])
    for idx in range(num_layers))

_input = tf.placeholder(tf.float32, shape=[batch_size, num_steps, size], name="input")
final_states, _output = RNN_OUT(_input, rnn_state_f, rnn_state_b)
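The remaining step, not shown above, is packing the fetched final_states back into the [num_layers, 2, batch_size, size] layout before feeding the next batch. A hedged sketch of that packing in plain NumPy (the session/feed_dict plumbing is assumed; states fetched via sess.run arrive as NumPy arrays, and pack_state is a hypothetical helper, not a TensorFlow API):

```python
import collections
import numpy as np

# Stand-in for tf.contrib.rnn.LSTMStateTuple; fetched states have this shape.
LSTMStateTuple = collections.namedtuple("LSTMStateTuple", ["c", "h"])

num_layers, batch_size, size = 2, 4, 8  # toy dimensions

# Pretend this is one direction's state fetched by sess.run(final_states, ...):
# a tuple with one LSTMStateTuple(c, h) per layer.
fetched_fw = tuple(
    LSTMStateTuple(c=np.random.rand(batch_size, size).astype(np.float32),
                   h=np.random.rand(batch_size, size).astype(np.float32))
    for _ in range(num_layers))

def pack_state(state):
    """Stack a tuple of LSTMStateTuples into [num_layers, 2, batch, size],
    matching the i_state_f / i_state_b placeholder layout above."""
    return np.stack([np.stack([layer.c, layer.h]) for layer in state])

packed = pack_state(fetched_fw)
print(packed.shape)  # (2, 2, 4, 8)

# Round trip: index [layer][0] recovers c and [layer][1] recovers h,
# which is exactly what tf.unstack + LSTMStateTuple undo in the graph code.
assert np.array_equal(packed[1][0], fetched_fw[1].c)
assert np.array_equal(packed[1][1], fetched_fw[1].h)
```

In the training loop this would be fed back as feed_dict={i_state_f: pack_state(fw_state), i_state_b: pack_state(bw_state), ...}, carrying the state across batches.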


Disclaimer: technical posts on this site are licensed under CC BY-SA 4.0; please credit this site or the original source when reposting. Questions: yoyou2525@163.com. © 2020-2024 STACKOOM.COM (粤ICP备18138465号)