How to share layer weights in a custom Keras model function

Question

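I want to share weights between the two sides of a siamese model.

Given two sets of inputs, each should pass through the exact same model function with the same weights (the siamese part). The two outputs are then concatenated to form the final output.

I have seen how to share specific layers in the documentation ( https://keras.io/getting-started/functional-api-guide/#shared-layers ) and in other questions on this board. That works.

But when I create my own multi-layer model function, Keras does not share the weights.

Here is a minimal example: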

from keras.layers import Input, Dense, concatenate
from keras.models import Model

# Define inputs
input_a = Input(shape=(16,), dtype='float32')
input_b = Input(shape=(16,), dtype='float32')

# My simple model
def my_model(x):
    x = Dense(128, input_shape=(x.shape[1],), activation='relu')(x)
    x = Dense(128, activation='relu')(x)
    return x

# Instantiate model parameters to share
processed_a = my_model(input_a)
processed_b = my_model(input_b)

# Concatenate output vector
final_output = concatenate([processed_a, processed_b], axis=-1)

model = Model(inputs=[input_a, input_b], outputs=final_output)

If the weights were shared, this model should have (16*128 + 128) + (128*128 + 128) = 18,688 parameters in total. If we check this:

model.summary()

it shows that we have double that number:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_3 (InputLayer)            (None, 16)           0                                            
__________________________________________________________________________________________________
input_4 (InputLayer)            (None, 16)           0                                            
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 128)          2176        input_3[0][0]                    
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 128)          2176        input_4[0][0]                    
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 128)          16512       dense_5[0][0]                    
__________________________________________________________________________________________________
dense_8 (Dense)                 (None, 128)          16512       dense_7[0][0]                    
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 256)          0           dense_6[0][0]                    
                                                                 dense_8[0][0]                    
==================================================================================================
Total params: 37,376
Trainable params: 37,376
Non-trainable params: 0
__________________________________________________________________________________________________
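
The two totals can be checked by hand. The following is a minimal sketch of the arithmetic only; `dense_params` is a hypothetical helper, not part of Keras, and it assumes each Dense layer has a kernel of shape (inputs, units) plus one bias per unit:

```python
def dense_params(in_dim, units):
    """Parameter count of a fully connected layer: kernel plus bias."""
    return in_dim * units + units


# One copy of the two-layer branch (16 -> 128 -> 128).
shared_total = dense_params(16, 128) + dense_params(128, 128)

# Two independent copies, which is what the summary above reports.
unshared_total = 2 * shared_total

print(shared_total)    # 18688
print(unshared_total)  # 37376
```

The doubled figure matches the 37,376 in the summary, which suggests each call to my_model built its own pair of Dense layers.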

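I'm not sure what I'm doing wrong. This is a simplified example. My actual code first loads a pretrained language model and encodes/processes the text input into vectors, then applies this siamese model. Because it is a pretrained model, it would be preferable to keep the model in a separate function like this.

Thank you.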

Answer 1

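The problem is that every time you call my_model you create entirely new layers (that is, a new Dense layer is initialized on each call). What you want is to initialize each layer only once. That looks like: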

from keras.layers import Input, Dense, concatenate
from keras.models import Model

# Define inputs
input_a = Input(shape=(16,), dtype='float32')
input_b = Input(shape=(16,), dtype='float32')

# Instantiate model parameters to share
layer1 = Dense(128, input_shape=(input_a.shape[1],), activation='relu')
layer2 = Dense(128, activation='relu')
processed_a = layer2(layer1(input_a))
processed_b = layer2(layer1(input_b))

# Concatenate output vector
final_output = concatenate([processed_a, processed_b], axis=-1)

model = Model(inputs=[input_a, input_b], outputs=final_output)

Now model.summary() gives:

Model: "model_2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_5 (InputLayer)            (None, 16)           0                                            
__________________________________________________________________________________________________
input_6 (InputLayer)            (None, 16)           0                                            
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 128)          2176        input_5[0][0]                    
                                                                 input_6[0][0]                    
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 128)          16512       dense_5[0][0]                    
                                                                 dense_5[1][0]                    
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 256)          0           dense_6[0][0]                    
                                                                 dense_6[1][0]                    
==================================================================================================
Total params: 18,688
Trainable params: 18,688
Non-trainable params: 0
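
The difference between constructing a layer and calling it can be illustrated without Keras. The sketch below is a toy stand-in only: `ToyDense`, `count_unique_params`, `build_per_call` and `make_shared_encoder` are hypothetical names, not Keras APIs. It assumes, as with Dense, that parameters belong to the layer object, so reusing the object reuses the parameters.

```python
class ToyDense:
    """Hypothetical stand-in for a Dense layer (not the real Keras class)."""

    def __init__(self, in_dim, units):
        # Parameters belong to the layer object and are sized at construction.
        self.n_params = in_dim * units + units

    def __call__(self, x):
        # Calling the layer reuses its parameters; a real layer would transform x.
        return x


def count_unique_params(layers):
    """Sum parameters over distinct layer objects, counting each object once."""
    unique = {id(layer): layer for layer in layers}
    return sum(layer.n_params for layer in unique.values())


def build_per_call(x, used):
    # Mirrors my_model from the question: new layer objects on every call.
    l1, l2 = ToyDense(16, 128), ToyDense(128, 128)
    used.extend([l1, l2])
    return l2(l1(x))


def make_shared_encoder(used):
    # Layers are created once here; the returned function only calls them.
    l1, l2 = ToyDense(16, 128), ToyDense(128, 128)

    def encode(x):
        used.extend([l1, l2])
        return l2(l1(x))

    return encode


per_call_layers = []
build_per_call("a", per_call_layers)
build_per_call("b", per_call_layers)

shared_layers = []
encode = make_shared_encoder(shared_layers)
encode("a")
encode("b")

print(count_unique_params(per_call_layers))  # 37376
print(count_unique_params(shared_layers))    # 18688
```

The factory form keeps the layers inside a function, as the question asks, while still creating them only once.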

Edit: if you want to create the layers inside a function only once, something like the following should work

from keras.models import Sequential

# Instantiate model parameters to share
def my_model(x):
    return Sequential([Dense(128, input_shape=(x.shape[1],), activation='relu'),
                      Dense(128, activation='relu')])
# create sequential model (and layers) only once
shared_model = my_model(input_a)
# calling the same model object on both inputs reuses its weights
processed_a = shared_model(input_a)
processed_b = shared_model(input_b)
