How to stack layers in Keras without using Sequential()?

Question

If I have a Keras layer L and I want to stack N copies of it (each with different weights) in a Keras model, what's the best way to do that? Note that N is large and controlled by a hyperparameter. If N were small this would not be a problem (we could just manually repeat a line N times), so let's assume N > 10, for example.

If the layer has only one input and one output, I can do something like:

m = Sequential()
for i in range(N):
    m.add(L)

But this does not work if my layer takes multiple inputs. For example, if my layer has the form z = L(x, y), and I would like my model to do:

x_1 = L(x_0, y)
x_2 = L(x_1, y)
...
x_N = L(x_N-1, y)

Then Sequential wouldn't do the job. I think I can subclass a Keras model, but I don't know the cleanest way to put N layers into the class. I can use a list, for example:

class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.layers = []
        for i in range(N):
            self.layers.append(L)
    def call(self, inputs):
        x = inputs[0]
        y = inputs[1]
        for i in range(N):
            x = self.layers[i](x, y)
        return x

But this is not ideal, as Keras won't recognize these layers (it does not seem to treat a list of layers as "checkpointables"). For example, MyModel.variables would be empty, and MyModel.save() won't save anything.
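A note for readers on newer versions: as far as I can tell, TensorFlow 2.x `tf.keras` does track layers stored in a plain Python list attribute, provided the attribute is not named `layers` (that name is a read-only property on `Model`). The following is only a minimal sketch under that assumption. The names `StackedModel` and `blocks`, the sizes, and the use of concatenation followed by `Dense` as a stand-in for the unspecified layer L are all hypothetical.

```python
import tensorflow as tf


class StackedModel(tf.keras.Model):
    """Chains n_blocks layers, each computing x_i = L_i(x_{i-1}, y)."""

    def __init__(self, n_blocks, units):
        super().__init__()
        # Stored under `blocks`, not `layers`, which is reserved by Model.
        # One new Dense instance per block, so each block has its own weights.
        self.blocks = [
            tf.keras.layers.Dense(units, activation="relu")
            for _ in range(n_blocks)
        ]

    def call(self, inputs):
        x, y = inputs
        for block in self.blocks:
            # Stand-in for L(x, y): concatenate the two inputs, then apply Dense.
            x = block(tf.concat([x, y], axis=-1))
        return x


model = StackedModel(n_blocks=12, units=8)
out = model([tf.zeros((2, 8)), tf.zeros((2, 4))])

# Each Dense block contributes a kernel and a bias.
n_weights = len(model.trainable_weights)
```

If the list is tracked as assumed, `model.trainable_weights` is non-empty after the first call, which is the behaviour the list-based attempt above was missing.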

I also tried to define the model using the functional API, but that doesn't work in my case either. In fact, if we do

def MyModel():
    input = Input(shape=...)
    output = SomeLayer(input)
    return Model(inputs=input, outputs=output)

it won't run if SomeLayer itself is a custom model (it raises NotImplementedError).

Any suggestions?

Answer 1

I'm not sure I've understood your question correctly, but I guess you could use the functional API and concatenate or add layers, as is done in Keras applications such as ResNet50 or InceptionV3, to build "non-sequential" networks.

UPDATE

In one of my projects, I used something like this. I had a custom layer (it was not implemented in my version of Keras, so I manually "backported" the code into my notebook).

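# Imports assumed for the snippets in this answer (standalone Keras 2.x).
import tensorflow as tf
from keras import backend as K
from keras.constraints import maxnorm
from keras.layers import (BatchNormalization, Conv2D, Dense, Flatten, Input,
                          Layer, MaxPooling2D)
from keras.models import Model, load_model
from keras.regularizers import l2

class LeakyReLU(Layer):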
    """Leaky version of a Rectified Linear Unit backported from newer Keras 
    version."""

    def __init__(self, alpha=0.3, **kwargs):
        super(LeakyReLU, self).__init__(**kwargs)
        self.supports_masking = True
        self.alpha = K.cast_to_floatx(alpha)

    def call(self, inputs):
        return tf.maximum(self.alpha * inputs, inputs)

    def get_config(self):
        config = {'alpha': float(self.alpha)}
        base_config = super(LeakyReLU, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def compute_output_shape(self, input_shape):
        return input_shape

Then, the model:

def create_model(input_shape, output_size, alpha=0.05, reg=0.001):
    inputs = Input(shape=input_shape)

    x = Conv2D(16, (3, 3), padding='valid', strides=(1, 1), 
               kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
               activation=None)(inputs)
    x = BatchNormalization()(x)
    x = LeakyReLU(alpha=alpha)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)

    x = Conv2D(32, (3, 3), padding='valid', strides=(1, 1),
               kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
               activation=None)(x)
    x = BatchNormalization()(x)
    x = LeakyReLU(alpha=alpha)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)

    x = Conv2D(64, (3, 3), padding='valid', strides=(1, 1),
               kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
               activation=None)(x)
    x = BatchNormalization()(x)
    x = LeakyReLU(alpha=alpha)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)

    x = Conv2D(128, (3, 3), padding='valid', strides=(1, 1),
               kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
               activation=None)(x)
    x = BatchNormalization()(x)
    x = LeakyReLU(alpha=alpha)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)

    x = Conv2D(256, (3, 3), padding='valid', strides=(1, 1),
               kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
               activation=None)(x)
    x = BatchNormalization()(x)
    x = LeakyReLU(alpha=alpha)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)

    x = Flatten()(x)
    x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
    x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
    x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
    x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
    x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
    x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
    x = Dense(output_size, activation='linear', kernel_regularizer=l2(reg))(x)

    model = Model(inputs=inputs, outputs=x)

    return model
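Since the question is about stacking many similar layers, it may be worth noting that the repeated blocks above can be generated in a loop instead of being written out by hand. This is only a sketch, assuming TensorFlow 2.x `tf.keras` and its built-in `LeakyReLU` and `MaxNorm`. The function name `create_model_looped`, the parameters `conv_filters` and `n_hidden`, and the example input shape are hypothetical and not part of the original project.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.constraints import MaxNorm
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense,
                                     Flatten, LeakyReLU, MaxPooling2D)
from tensorflow.keras.regularizers import l2


def create_model_looped(input_shape, output_size, alpha=0.05, reg=0.001,
                        conv_filters=(16, 32, 64, 128, 256), n_hidden=6):
    inputs = Input(shape=input_shape)
    x = inputs

    # One Conv -> BatchNorm -> LeakyReLU -> MaxPool block per filter count.
    for filters in conv_filters:
        x = Conv2D(filters, (3, 3), padding="valid", strides=(1, 1),
                   kernel_regularizer=l2(reg), kernel_constraint=MaxNorm(3),
                   activation=None)(x)
        x = BatchNormalization()(x)
        # Slope passed positionally because its keyword name differs
        # between Keras versions.
        x = LeakyReLU(alpha)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)

    x = Flatten()(x)
    # Stack of identical hidden Dense layers, each with its own weights.
    for _ in range(n_hidden):
        x = Dense(500, activation="relu", kernel_regularizer=l2(reg))(x)
    outputs = Dense(output_size, activation="linear",
                    kernel_regularizer=l2(reg))(x)

    return Model(inputs=inputs, outputs=outputs)


# Hypothetical input: 128x128 single-channel images, 10 regression targets.
looped_model = create_model_looped(input_shape=(128, 128, 1), output_size=10)
```

Every loop iteration calls the layer constructor again, so each block gets separate weights, and the depth is controlled by the length of `conv_filters` and by `n_hidden`.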

Finally, a custom metric:

def root_mean_squared_error(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))

I used the following snippet to create and compile the model:

model = create_model(input_shape=X.shape[1:], output_size=y.shape[1])
model.compile(loss=root_mean_squared_error, optimizer='adamax')

As usual, I used a checkpoint callback to save the model. To load the model, you need to pass the custom layer classes and metrics to the load_model function as well:

def load_custom_model(path):
    return load_model(path, custom_objects={
        'LeakyReLU': LeakyReLU,
        'root_mean_squared_error': root_mean_squared_error
    })

Does it help?

Answer 2

After researching this extensively, I'm certain that there is no built-in, universal way to do this in TensorFlow/Keras.

There are, however, still ways to achieve the same goal. Because there is no universal solution in TensorFlow that works with any Keras layer, you'll have to approach it on a case-by-case basis.

For example, suppose you want to stack a bunch of Dense layers and have some dimension of your input correspond to each of them (a simple example). Instead, you would construct a custom Dense layer, add an extra dimension to its weights and biases, and then perform the appropriate operations along the matching extra dimension of your input.

Ultimately the same (desired) operations would be performed: each input slice along that dimension would be put through a separate Dense layer with separate weights/biases, but it would all be done concurrently, without any Python looping. In essence, you would be reducing the size and complexity of the graph and performing the same operations in a more concurrent way; this ought to be much more efficient.

The strategy outlined here generalises to any layer/input type. It's not great news, though: it would be of very high value to us (users) if there were a standardised, Keras-friendly way of stacking a bunch of layers and passing input to them concurrently, one that didn't involve Python looping but instead concatenated the internal parameters along a new dimension and managed the alignment between the extra 'stacking' dimension in the inputs and in the parameters.

In the same way that we have tf.keras.Sequential, we could also benefit from something like tf.keras.Parallel as a universal solution to this common ML need.
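To make the idea concrete, here is a small NumPy sketch (not Keras code) of N independent dense layers expressed as a single batched operation. The parameters carry an extra leading "stacking" dimension that is aligned with a matching dimension of the input. All names and sizes are hypothetical, and activations are omitted for brevity. Note that this applies the layers side by side; it does not by itself cover the chained case x_i = L(x_{i-1}, y) from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, batch, d_in, d_out = 12, 5, 4, 3

# Parameters of all layers stored together along a leading stacking axis.
kernels = rng.normal(size=(n_layers, d_in, d_out))
biases = rng.normal(size=(n_layers, d_out))

# Input with a matching stacking axis: slice n is fed to layer n.
inputs = rng.normal(size=(batch, n_layers, d_in))

# Reference: explicit Python loop over the separate layers.
looped = np.stack(
    [inputs[:, n] @ kernels[n] + biases[n] for n in range(n_layers)],
    axis=1,
)

# Same computation as one batched contraction, with no Python loop.
batched = np.einsum("bni,nio->bno", inputs, kernels) + biases
```

Both `looped` and `batched` have shape `(batch, n_layers, d_out)` and hold the same values up to floating-point rounding. A custom Keras layer following this answer would do the equivalent contraction inside its `call` method.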

Answer 3

If I understand your question correctly, you can solve this problem simply by using a for-loop when building the model. I'm not sure if you need any special layer, so I will assume you only use Dense here:

def MyModel():
    input = Input(shape=...)
    x = input
    for i in range(N):
        x = Dense(number_of_nodes, name='dense_%i' % i)(x)
        # Or some custom layers
    output = Dense(number_of_output)(x)

    return Model(inputs=input, outputs=output)
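The same loop extends to the two-input case from the question, x_i = L(x_{i-1}, y). Below is a minimal sketch, assuming TensorFlow 2.x `tf.keras`. Since the actual layer L is not specified, `Concatenate` followed by `Dense` is used as a stand-in, and the function name `build_stacked_model` and all sizes are hypothetical.

```python
import numpy as np
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Concatenate, Dense

N = 12      # number of stacked blocks (hypothetical hyperparameter)
UNITS = 8   # feature size of x (hypothetical)
Y_DIM = 4   # feature size of y (hypothetical)


def build_stacked_model(n_blocks=N, units=UNITS, y_dim=Y_DIM):
    x_in = Input(shape=(units,), name="x0")
    y_in = Input(shape=(y_dim,), name="y")

    x = x_in
    for i in range(n_blocks):
        # New layer instances on every iteration, so each block has its own
        # weights; y is fed into every block.
        merged = Concatenate(name=f"concat_{i}")([x, y_in])
        x = Dense(units, activation="relu", name=f"block_{i}")(merged)

    return Model(inputs=[x_in, y_in], outputs=x)


model = build_stacked_model()
out = model.predict(
    [np.zeros((2, UNITS), dtype="float32"), np.zeros((2, Y_DIM), dtype="float32")],
    verbose=0,
)
```

Because the model is built with the functional API, Keras knows about every layer created in the loop, so the weights of all N blocks are part of the model.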

Answer 1
3 2018-06-26 10:54:05

Answer 2
1 2021-10-06 10:34:55

Answer 3
0 2018-06-27 14:07:52
