
Interpreting get_weights in an LSTM model in Keras

This is my simple reproducible code:

from keras.callbacks import ModelCheckpoint
from keras.models import Model
from keras.models import load_model
import keras
import numpy as np
import os

SEQUENCE_LEN = 45
LATENT_SIZE = 20
VOCAB_SIZE = 100

# Encoder: bidirectional LSTM whose forward and backward outputs are summed
inputs = keras.layers.Input(shape=(SEQUENCE_LEN, VOCAB_SIZE), name="input")
encoded = keras.layers.Bidirectional(keras.layers.LSTM(LATENT_SIZE), merge_mode="sum", name="encoder_lstm")(inputs)

# Decoder: repeat the latent vector and map it back to the vocabulary size
decoded = keras.layers.RepeatVector(SEQUENCE_LEN, name="repeater")(encoded)
decoded = keras.layers.Bidirectional(keras.layers.LSTM(VOCAB_SIZE, return_sequences=True), merge_mode="sum", name="decoder_lstm")(decoded)

autoencoder = keras.models.Model(inputs, decoded)
autoencoder.compile(optimizer="sgd", loss='mse')
autoencoder.summary()

# Dummy data, just enough to run one epoch and write a checkpoint
x = np.random.randint(0, 90, size=(10, SEQUENCE_LEN, VOCAB_SIZE))
y = np.random.normal(size=(10, SEQUENCE_LEN, VOCAB_SIZE))

NUM_EPOCHS = 1
os.makedirs('checkpoint', exist_ok=True)  # ModelCheckpoint needs the target directory to exist
checkpoint = ModelCheckpoint(filepath='checkpoint/{epoch}.hdf5')
history = autoencoder.fit(x, y, epochs=NUM_EPOCHS, callbacks=[checkpoint])

and here is my code to have a look at the weights in the encoder layer:

for epoch in range(1, NUM_EPOCHS + 1):
    file_name = "checkpoint/" + str(epoch) + ".hdf5"
    lstm_autoencoder = load_model(file_name)
    # Sub-model that exposes the output of the bidirectional encoder layer
    encoder = Model(lstm_autoencoder.input, lstm_autoencoder.get_layer('encoder_lstm').output)
    print(encoder.output_shape[1])
    weights = encoder.get_weights()[0]
    print(weights.shape)
    # For each output dimension, sort vocabulary indices by their kernel weight (descending)
    for idx in range(encoder.output_shape[1]):
        token_idx = np.argsort(weights[:, idx])[::-1]

Here, print(encoder.output_shape) gives (None, 20) and print(weights.shape) gives (100, 80).

I understand that get_weights gives the weight transformation applied by the layer.

The part I do not get, based on this architecture, is the 80. What is it?

Also, are the weights here the weights that connect the encoder layer to the decoder? I mean the connection between the encoder and the decoder.

I had a look at this question here. Since it only involves simple dense layers, I could not connect the concept to the seq2seq model.

Update 1

What is the difference between encoder.get_weights()[0] and encoder.get_weights()[1]? The first one is (100, 80) and the second one is (20, 80); what do they represent conceptually?

Any help is appreciated :)

The encoder as you have defined it is a model, and it consists of two layers: an input layer and the 'encoder_lstm' layer, which is the bidirectional LSTM layer of the autoencoder. So its output shape is the output shape of the 'encoder_lstm' layer, which is (None, 20) (because you have set LATENT_SIZE = 20 and merge_mode="sum"). So the output shape is correct and clear.
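As a quick check (a minimal sketch reusing the constants from the question), you can compare merge_mode="sum" with merge_mode="concat" to see why the latent dimension stays at LATENT_SIZE instead of doubling:

# Minimal sketch: how merge_mode affects the Bidirectional output size.
inp = keras.layers.Input(shape=(SEQUENCE_LEN, VOCAB_SIZE))
summed = keras.layers.Bidirectional(keras.layers.LSTM(LATENT_SIZE), merge_mode="sum")(inp)
concat = keras.layers.Bidirectional(keras.layers.LSTM(LATENT_SIZE), merge_mode="concat")(inp)
print(keras.models.Model(inp, summed).output_shape)  # (None, 20): forward and backward outputs added element-wise
print(keras.models.Model(inp, concat).output_shape)  # (None, 40): forward and backward outputs concatenated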

However, since encoder is a model, when you run encoder.get_weights() it returns the weights of all the layers in the model as a list. The bidirectional LSTM consists of two separate LSTM layers. Each of those LSTM layers has 3 weights: the kernel, the recurrent kernel and the biases. So encoder.get_weights() returns a list of 6 arrays, 3 for each of the LSTM layers. The first element of this list, which you have stored in weights and which is the subject of your question, is the kernel of one of the LSTM layers. The kernel of an LSTM layer has a shape of (input_dim, 4 * lstm_units). The input dimension of the 'encoder_lstm' layer is VOCAB_SIZE and its number of units is LATENT_SIZE. Therefore, the shape of the kernel is (VOCAB_SIZE, 4 * LATENT_SIZE) = (100, 80).
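To make the list of six arrays concrete, here is a minimal sketch (assuming the encoder sub-model from the loop in the question is in scope) that prints the shape of each weight array; the factor of 4 in the kernel shape comes from the four LSTM gates (input, forget, cell and output) being stored in one concatenated matrix:

# Minimal sketch: enumerate the six weight arrays of the bidirectional encoder.
# Each of the two internal LSTM layers contributes three arrays:
#   kernel           -> (VOCAB_SIZE, 4 * LATENT_SIZE)  = (100, 80)
#   recurrent kernel -> (LATENT_SIZE, 4 * LATENT_SIZE) = (20, 80)
#   bias             -> (4 * LATENT_SIZE,)             = (80,)
for i, w in enumerate(encoder.get_weights()):
    print(i, w.shape)

This also answers Update 1: encoder.get_weights()[0] is the kernel, which multiplies the layer's input (hence (100, 80)), while encoder.get_weights()[1] is the recurrent kernel, which multiplies the previous hidden state (hence (20, 80)).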
