
How to implement Tensorflow batch normalization in LSTM

My current LSTM network looks like this:

import tensorflow as tf  # TF 1.x API

# CELL_SIZE and the input placeholder tf_x are defined elsewhere
rnn_cell = tf.contrib.rnn.BasicRNNCell(num_units=CELL_SIZE)
init_s = rnn_cell.zero_state(batch_size=1, dtype=tf.float32)  # very first hidden state
outputs, final_s = tf.nn.dynamic_rnn(
    rnn_cell,              # cell you have chosen
    tf_x,                  # input
    initial_state=init_s,  # the initial hidden state
    time_major=False,      # False: (batch, time step, input); True: (time step, batch, input)
)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(outputs, [-1, CELL_SIZE])
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])

Usually I apply tf.layers.batch_normalization as batch normalization, but I am not sure whether this works in an LSTM network.

b1 = tf.layers.batch_normalization(outputs, momentum=0.4, training=True)
d1 = tf.layers.dropout(b1, rate=0.4, training=True)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(d1, [-1, CELL_SIZE])                       
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])

Based on the paper "Layer Normalization" by Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton:

TensorFlow now comes with tf.contrib.rnn.LayerNormBasicLSTMCell, an LSTM unit with layer normalization and recurrent dropout.

You can find its documentation in the TensorFlow API docs for tf.contrib.rnn.LayerNormBasicLSTMCell.
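
For example, a minimal sketch that swaps the BasicRNNCell in the question's code for LayerNormBasicLSTMCell, reusing the question's CELL_SIZE and tf_x; the dropout keep probability is just an illustrative value:

lstm_cell = tf.contrib.rnn.LayerNormBasicLSTMCell(
    num_units=CELL_SIZE,
    layer_norm=True,          # layer normalization on the cell's internal activations
    dropout_keep_prob=0.6,    # recurrent dropout keep probability (illustrative value)
)
init_s = lstm_cell.zero_state(batch_size=1, dtype=tf.float32)  # LSTMStateTuple of zeros
outputs, final_s = tf.nn.dynamic_rnn(
    lstm_cell,
    tf_x,
    initial_state=init_s,
    time_major=False,
)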

If you want to use batch norm for an RNN (LSTM or GRU), you can check out this implementation, or read the full description in the blog post.

However, layer normalization has more advantages than batch norm for sequence data. Specifically, "the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent networks" (from the paper: Ba et al., "Layer Normalization").
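
To make that difference concrete, here is a small sketch (not part of the question's code) of which axes the statistics are computed over for a [batch, time, features] activation tensor; the tensor x and its shape are hypothetical:

import tensorflow as tf  # TF 1.x

# x: a hypothetical activation tensor of shape [batch, time, features]
x = tf.placeholder(tf.float32, [None, 10, 32])

# Batch norm statistics: per feature, computed over the batch (and time) axes,
# so the result depends on which examples are in the mini-batch and on its size.
bn_mean, bn_var = tf.nn.moments(x, axes=[0, 1], keep_dims=True)

# Layer norm statistics: per example and time step, computed over the feature
# axis only, so they do not depend on the rest of the mini-batch.
ln_mean, ln_var = tf.nn.moments(x, axes=[2], keep_dims=True)

x_bn = (x - bn_mean) / tf.sqrt(bn_var + 1e-6)  # normalization step only; no learned scale/offset
x_ln = (x - ln_mean) / tf.sqrt(ln_var + 1e-6)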

Layer normalization normalizes the summed inputs within each layer. You can check out the implementation of layer normalization for a GRU cell.
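
As a rough sketch of the idea (not the linked implementation), a layer-norm helper applied to a cell's summed inputs might look like this in the same TF 1.x style; the helper and all names in the usage comment are hypothetical:

import tensorflow as tf  # TF 1.x

def layer_norm(summed_inputs, scope):
    # Normalize a [batch, num_units] tensor over its feature axis, then apply a
    # learned gain and bias, as described in Ba et al. (2016).
    with tf.variable_scope(scope):
        num_units = summed_inputs.get_shape().as_list()[1]
        gain = tf.get_variable('gain', [num_units], initializer=tf.ones_initializer())
        bias = tf.get_variable('bias', [num_units], initializer=tf.zeros_initializer())
        mean, variance = tf.nn.moments(summed_inputs, axes=[1], keep_dims=True)
        return gain * (summed_inputs - mean) / tf.sqrt(variance + 1e-6) + bias

# Inside a GRU step, each summed input would be wrapped before its gate
# nonlinearity, e.g. (W_z, U_z, x_t, h_prev are hypothetical names):
#   z = tf.sigmoid(layer_norm(tf.matmul(x_t, W_z), 'ln_x_z') +
#                  layer_norm(tf.matmul(h_prev, U_z), 'ln_h_z'))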
