How to feed RNN output back to its input in TensorFlow

Question

Suppose I have a trained RNN (e.g. a language model) and I want to see what it would generate on its own. How should I feed its output back to its input?

I read the following related questions:

Theoretically it is clear to me that in TensorFlow we use truncated backpropagation, so we have to define the maximum number of steps we would like to "trace". We also reserve a dimension for batches, so if I'd like to train on a sine wave, I have to feed inputs of shape [None, num_step, 1].

The following code works:

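import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (placeholders, sessions, tf.contrib)
import matplotlib.pyplot as plt

tf.reset_default_graph()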
n_samples=100

state_size=5

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(state_size, forget_bias=1.)
def_x = np.sin(np.linspace(0, 10, n_samples))[None, :, None]
zero_x = np.zeros(n_samples)[None, :, None]
X = tf.placeholder_with_default(zero_x, [None, n_samples, 1])
output, last_states = tf.nn.dynamic_rnn(inputs=X, cell=lstm_cell, dtype=tf.float64)

pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)

Y = np.roll(def_x, 1)
loss = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)


opt = tf.train.AdamOptimizer().minimize(loss)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Initial state run
plt.show(plt.plot(output.eval()[0]))
plt.plot(def_x.squeeze())
plt.show(plt.plot(pred.eval().squeeze()))

steps = 1001
for i in range(steps):
    p, l, _= sess.run([pred, loss, opt])

The state size of the LSTM can be varied. I also experimented with feeding the network a sine wave and with feeding it zeros, and in both cases it converged in ~500 iterations. So far I have understood that in this case the graph consists of n_samples LSTM cells sharing their parameters, and it is up to me to feed them input as a time series. However, when generating samples the network explicitly depends on its previous output, meaning that I cannot feed the unrolled model all at once. I tried to compute the state and output at every step:

with tf.variable_scope('sine', reuse=True):
    X_test = tf.placeholder(tf.float64)
    X_reshaped = tf.reshape(X_test, [1, -1, 1])
    output, last_states = tf.nn.dynamic_rnn(lstm_cell, X_reshaped, dtype=tf.float64)
    pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)


    test_vals = [0.]
    for i in range(1000):
        val = pred.eval({X_test:np.array(test_vals)[None, :, None]})
        test_vals.append(val)

However, in this model there seems to be no continuity between the LSTM cells. What is going on here?

Do I have to initialize a zero array with, e.g., 100 time steps and assign each run's result into the array? Like feeding the network this:

run 0: input_feed = [0, 0, 0 ... 0]; res1 = result

run 1: input_feed = [res1, 0, 0 ... 0]; res2 = result

run 2: input_feed = [res1, res2, 0 ... 0]; res3 = result

etc.

What should I do if I want this trained network to use its own output as its input in the following time step?

Answer 1

If I understood you correctly, you want to find a way to feed the output of time step t as input to time step t+1, right? To do so, there is a relatively easy workaround that you can use at test time:

Make sure your input placeholders can accept a dynamic sequence length, i.e. the size of the time dimension is None.
Make sure you are using tf.nn.dynamic_rnn (which you do in the posted example).
Pass the initial state into dynamic_rnn.
Then, at test time, you can loop through your sequence and feed each time step individually (i.e. max sequence length is 1). Additionally, you just have to carry over the internal state of the RNN. See the pseudocode below (the variable names refer to your code snippet).

I.e., change the definition of the model to something like this:

# define the data first, since zero_x is used as the placeholder default
def_x = np.sin(np.linspace(0, 10, n_samples))[None, :, None]
zero_x = np.zeros(n_samples)[None, :, None]

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(state_size, forget_bias=1.)
# the time dimension is None so that a single step can be fed at test time
X = tf.placeholder_with_default(zero_x, [None, None, 1])  # [batch_size, seq_length, dimension of input]
batch_size = tf.shape(X)[0]
# the state dtype must match the float64 inputs
initial_state = lstm_cell.zero_state(batch_size, dtype=tf.float64)
output, last_states = tf.nn.dynamic_rnn(inputs=X, cell=lstm_cell, dtype=tf.float64,
    initial_state=initial_state)
pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)

Then you can perform inference like so:

# last_states is the final-state tensor returned by dynamic_rnn above
fetches = {'final_state': last_states,
           'prediction': pred}

toy_initial_input = np.array([[[1.]]])  # shape [1, 1, 1]; put suitable data here
seq_length = 20  # put whatever is reasonable here for you

# get the output for the first time step
feed_dict = {X: toy_initial_input}
eval_out = sess.run(fetches, feed_dict)
outputs = [eval_out['prediction']]
next_state = eval_out['final_state']

for i in range(1, seq_length):
    feed_dict = {X: outputs[-1],
                 initial_state: next_state}
    eval_out = sess.run(fetches, feed_dict)
    outputs.append(eval_out['prediction'])
    next_state = eval_out['final_state']

# outputs now contains the sequence you want
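
The same feedback loop can be sketched without TensorFlow to make the data flow explicit. This is only an illustration under stated assumptions: `rnn_step` is a hypothetical scalar cell with made-up weights standing in for the trained LSTM plus output layer, and `generate` is a hypothetical helper name. What it shows is that each prediction becomes the next input while the state is carried from step to step.

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.8, b=0.1):
    """One step of a toy scalar RNN cell (weights are arbitrary assumptions).

    Returns (output, new_state); for this toy cell the output equals the state.
    """
    h_new = math.tanh(w_x * x + w_h * h + b)
    return h_new, h_new

def generate(seed, n_steps, h0=0.0):
    """Feed each output back as the next input, carrying the state along."""
    x, h, generated = seed, h0, []
    for _ in range(n_steps):
        y, h = rnn_step(x, h)   # one time step, starting from the carried state
        generated.append(y)
        x = y                   # the prediction becomes the next input
    return generated

generated = generate(seed=1.0, n_steps=20)
```

In the TensorFlow version above, `x = y` corresponds to feeding `outputs[-1]` into `X`, and carrying `h` corresponds to feeding `next_state` into `initial_state`.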

Note that this can also work for batches; however, it can be a bit more complicated if you have sequences of different lengths in the same batch.

If you want to perform this kind of prediction not only at test time but also at training time, that is possible too, but a bit more complicated to implement.

Answer 2

You can use its own output (last state) as the next-step input (initial state). One way to do this is to:

use zero-initialized variables as the input state at every time step
each time you complete a truncated sequence and get an output state, update the state variables with that output state.

The second can be done by either:

fetching the states to Python and feeding them back next time, as done in the ptb example in tensorflow/models
building an update op in the graph and adding a dependency, as done in the ptb example in tensorpack.
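
The state-variable idea can be illustrated in plain Python. This is a sketch under assumptions, not the ptb example's code: `StatefulRNN` is a hypothetical toy class with arbitrary scalar weights, and its `state` attribute plays the role of the zero-initialized state variable that gets overwritten after every truncated chunk.

```python
import math

class StatefulRNN:
    """Toy scalar RNN that keeps its state between calls (hypothetical helper)."""

    def __init__(self, w_x=0.5, w_h=0.8, b=0.1):
        self.w_x, self.w_h, self.b = w_x, w_h, b
        self.state = 0.0  # zero-initialized state "variable"

    def reset(self):
        self.state = 0.0

    def run_chunk(self, xs):
        """Process one truncated chunk, starting from the stored state."""
        h, outputs = self.state, []
        for x in xs:
            h = math.tanh(self.w_x * x + self.w_h * h + self.b)
            outputs.append(h)
        self.state = h  # update the stored state with the chunk's final state
        return outputs

long_seq = [math.sin(0.1 * t) for t in range(12)]

# Process the sequence in truncated chunks of 4 steps each.
rnn = StatefulRNN()
chunked = []
for start in range(0, len(long_seq), 4):
    chunked.extend(rnn.run_chunk(long_seq[start:start + 4]))

# Process the same sequence in one go from a fresh zero state.
rnn.reset()
whole = rnn.run_chunk(long_seq)
```

Because the state is written back after each chunk, `chunked` and `whole` contain the same values; without the write-back every chunk would restart from zero and the continuity would be lost.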

Answer 3

I know I'm a bit late to the party, but I think this gist could be useful:

https://gist.github.com/CharlieCodex/f494b27698157ec9a802bc231d8dcf31

It lets you autofeed the network's output through a filter and back into the network as input. To make the shapes match up, processing can be set to a tf.layers.Dense layer.

Please ask any questions!

Edit:

In your particular case, create a lambda which performs the processing of the dynamic_rnn outputs into your character vector space. Ex:

# if you have:
W = tf.Variable( ... )
B = tf.Variable( ... )
Yo, Ho = tf.nn.dynamic_rnn( cell , inputs , state )
logits = tf.matmul(W, Yo) + B
 ...
# use self_feeding_rnn as
process_yo = lambda Yo: tf.matmul(W, Yo) + B
Yo, Ho = self_feeding_rnn( cell, seed, initial_state, processing=process_yo)
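
To show the general shape of such a self-feeding loop without relying on the gist, here is a minimal plain-Python sketch. It does not reproduce the gist's implementation or signature: `self_feed`, `toy_cell`, and the weights are hypothetical names and values chosen for illustration only.

```python
import math

def self_feed(step_fn, seed, initial_state, n_steps, processing=lambda y: y):
    """Run step_fn repeatedly, passing each processed output back in as input."""
    x, state, outputs = seed, initial_state, []
    for _ in range(n_steps):
        y, state = step_fn(x, state)  # one RNN step from the carried state
        x = processing(y)             # map the raw output into the input space
        outputs.append(x)
    return outputs, state

def toy_cell(x, h):
    """Toy scalar RNN cell with arbitrary weights; returns (output, new_state)."""
    h_new = math.tanh(0.5 * x + 0.8 * h)
    return h_new, h_new

outputs, final_state = self_feed(
    toy_cell, seed=1.0, initial_state=0.0, n_steps=5,
    processing=lambda y: 2.0 * y,  # stands in for the W/B projection above
)
```

The `processing` argument plays the same role as `process_yo` in the snippet above: it converts the raw cell output into something with the shape and meaning of an input.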

Answer 1
6 2017-12-19 13:51:34

Answer 2
1 2017-02-24 17:32:12

Answer 3
1 2019-02-01 19:32:28
