
Replacing the embedding layer in a pretrained Keras model

I'm trying to replace the embedding layer in a Keras NLP model. I've trained the model for one language, but I would like to transfer it to another language for which I have comparable embeddings. I hope to achieve this by replacing the index-to-embedding mapping for the source language with the index-to-embedding mapping for the target language.
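
For context, a matrix like my_new_embedding_matrix below can be assembled along these lines (a sketch; target_index and target_vectors are placeholder names for the target-language vocabulary and its pretrained vectors):

import numpy as np

vocab_size, embedding_dim = 1000, 300

# Placeholder inputs: target_index maps each target-language word to an
# integer id below vocab_size; target_vectors maps words to pretrained
# 300-dimensional vectors.
my_new_embedding_matrix = np.zeros((vocab_size, embedding_dim), dtype="float32")
for word, idx in target_index.items():
    vector = target_vectors.get(word)
    if vector is not None:  # words without a pretrained vector stay zero
        my_new_embedding_matrix[idx] = vector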

I've tried to do it like this:

from keras.layers import Embedding
from keras.models import load_model

filename = "my_model.h5"
model = load_model(filename)

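# Wrap the target-language matrix in a frozen embedding layer and try to
# swap it in for the original first layer.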
new_embedding_layer = Embedding(1000, 300, weights=[my_new_embedding_matrix], trainable=False)
new_embedding_layer.build((None, None))
model.layers[0] = new_embedding_layer

When I print out the model summary, this seems to have worked: the new embedding layer has the correct number of parameters (1000*300 = 300,000):

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_85 (Embedding)     multiple                  300000    
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               219648    
_________________________________________________________________
dense_1 (Dense)              (None, 23)                2967      
=================================================================
Total params: 522,615
Trainable params: 222,615
Non-trainable params: 300,000

However, when I use the new model to process new examples, nothing seems to have changed: it still accepts input sequences that have values larger than the new vocabulary size of 1000, and returns the same predictions as before.

import numpy as np

seq = np.array([10000])
model.predict([seq])
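
A quick check suggests the assignment only mutated the layer list, while the model's graph still references the original embedding layer (a sketch; the exact behaviour may vary across Keras versions):

# The list entry was replaced...
print(model.layers[0] is new_embedding_layer)  # True
# ...but the next layer's input is still the output tensor that the *old*
# embedding layer produced when the graph was first built.
print(model.layers[1].input)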

I also notice that the output shape of the new embedding layer is "multiple" rather than (None, None, 300). Maybe this is related to the problem?

Can anyone tell me what I'm missing?

If your Embedding layers have the same shape, then you can simply load your model as you did:

from keras.models import load_model

filename = "my_model.h5"
model = load_model(filename)

Then, rather than building a new embedding layer, you can simply set the weights of the old one:

model.layers[idx_of_your_embedding_layer].set_weights([my_new_embedding_matrix])
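
For completeness, a minimal end-to-end sketch of this fix, reusing my_new_embedding_matrix from the question; it assumes the Embedding sits at layer index 0 and that the new matrix has exactly the same shape as the old weights:

from keras.models import load_model

model = load_model("my_model.h5")

embedding = model.layers[0]              # assumption: the Embedding is layer 0
old_matrix = embedding.get_weights()[0]  # shape (vocab_size, embedding_dim)

# The replacement matrix must match the old shape exactly, e.g. (1000, 300).
assert my_new_embedding_matrix.shape == old_matrix.shape

# set_weights takes a list of arrays, one per weight variable of the layer.
embedding.set_weights([my_new_embedding_matrix])

Because the graph still flows through the same layer object, predict() now uses the new vectors; input ids just have to stay within the original vocabulary range.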
