
How do you apply layer normalization in an RNN using tf.keras?

I would like to apply layer normalization to a recurrent neural network using tf.keras. In TensorFlow 2.0, there is a LayerNormalization class in tf.keras.layers.experimental, but it's unclear how to use it within a recurrent layer like LSTM, at each time step (as it was designed to be used). Should I create a custom cell, or is there a simpler way?

For example, applying dropout at each time step is as easy as setting the recurrent_dropout argument when creating an LSTM layer, but there is no recurrent_layer_normalization argument.
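For reference, the built-in dropout case mentioned above looks roughly like this (the layer size and dropout rate are arbitrary values chosen for illustration):

import tensorflow as tf

# Recurrent dropout is a built-in argument of the LSTM layer...
lstm = tf.keras.layers.LSTM(20, recurrent_dropout=0.2, return_sequences=True)
# ...but there is no analogous argument for per-step layer normalization.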

You can create a custom cell by inheriting from the SimpleRNNCell class, like this:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.activations import get as get_activation
from tensorflow.keras.layers import SimpleRNNCell, RNN, LayerNormalization

class SimpleRNNCellWithLayerNorm(SimpleRNNCell):
    def __init__(self, units, **kwargs):
        # Remember the desired activation, then build the base cell with no activation
        activation = get_activation(kwargs.get("activation", "tanh"))
        kwargs["activation"] = None
        super().__init__(units, **kwargs)
        # Restore it here: super().__init__() resets self.activation to linear
        self.activation = activation
        self.layer_norm = LayerNormalization()

    def call(self, inputs, states):
        # One linear SimpleRNN step (no activation yet)
        outputs, new_states = super().call(inputs, states)
        # Normalize the pre-activation output, then apply the activation
        norm_out = self.activation(self.layer_norm(outputs))
        return norm_out, [norm_out]

This implementation runs a regular SimpleRNN cell for one step without any activation, then it applies layer norm to the resulting output, then it applies the activation. You can then use it like this:

model = Sequential([
    RNN(SimpleRNNCellWithLayerNorm(20), return_sequences=True,
        input_shape=[None, 20]),
    RNN(SimpleRNNCellWithLayerNorm(5)),
])

model.compile(loss="mse", optimizer="sgd")
X_train = np.random.randn(100, 50, 20)
Y_train = np.random.randn(100, 5)
history = model.fit(X_train, Y_train, epochs=2)

For GRU and LSTM cells, people generally apply layer norm on the gates (after the linear combination of the inputs and states, and before the sigmoid activation), so it's a bit trickier to implement. Alternatively, you can probably get good results by just applying layer norm before applying the activation and recurrent_activation, which would be easier to implement.
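As a rough illustration of the gate-level approach, here is a minimal sketch of a custom cell. The name GateLayerNormLSTMCell and the choice to normalize each gate's pre-activation plus the cell state are assumptions for the sketch, not part of the original answer or of any library API:

import tensorflow as tf
from tensorflow.keras.layers import Layer, LayerNormalization

class GateLayerNormLSTMCell(Layer):
    """Sketch: LSTM cell with layer norm on the gate pre-activations."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.state_size = [units, units]  # [h, c]
        self.output_size = units
        self.gate_norms = [LayerNormalization() for _ in range(4)]
        self.cell_norm = LayerNormalization()

    def build(self, input_shape):
        input_dim = input_shape[-1]
        self.kernel = self.add_weight(
            "kernel", shape=(input_dim, 4 * self.units))
        self.recurrent_kernel = self.add_weight(
            "recurrent_kernel", shape=(self.units, 4 * self.units))
        self.bias = self.add_weight(
            "bias", shape=(4 * self.units,), initializer="zeros")

    def call(self, inputs, states):
        h, c = states
        # Linear combination of the inputs and the previous hidden state
        z = inputs @ self.kernel + h @ self.recurrent_kernel + self.bias
        zi, zf, zg, zo = tf.split(z, 4, axis=-1)
        # Layer norm on each gate's pre-activation, then the usual activations
        i = tf.sigmoid(self.gate_norms[0](zi))   # input gate
        f = tf.sigmoid(self.gate_norms[1](zf))   # forget gate
        g = tf.tanh(self.gate_norms[2](zg))      # candidate cell state
        o = tf.sigmoid(self.gate_norms[3](zo))   # output gate
        new_c = f * c + i * g
        new_h = o * tf.tanh(self.cell_norm(new_c))
        return new_h, [new_h, new_c]

You can then wrap this cell in a tf.keras.layers.RNN layer, just like the SimpleRNN-based cell above.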

In TensorFlow Addons, there's a pre-built LayerNormLSTMCell available out of the box.

See this doc for more details. You may have to install tensorflow-addons before you can import this cell.

pip install tensorflow-addons
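Once installed, a minimal sketch of how the Addons cell can be plugged into the same kind of model as above (the layer sizes are arbitrary, and details may vary across tensorflow-addons versions):

import tensorflow as tf
import tensorflow_addons as tfa

model = tf.keras.Sequential([
    tf.keras.layers.RNN(tfa.rnn.LayerNormLSTMCell(20), return_sequences=True,
                        input_shape=[None, 20]),
    tf.keras.layers.RNN(tfa.rnn.LayerNormLSTMCell(5)),
])
model.compile(loss="mse", optimizer="sgd")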
