
Merge multiple CNN models

I am trying to implement this paper here.

This is the CNN architecture I'm trying to implement:

[Figure: CNN model architecture]

This text is from the paper itself, describing the layers:

The CNN architecture in Figure 5 is shown in a top-down manner starting from the start (top) to the finish (bottom) node. "NL" stands for N-gram Length. The breakdown is:

  1. An input layer of size 1 × 100 × N where N is the number of instances from the dataset. Vectors of embedded words are used as the initial input.
  2. Then the layers between the input and the concatenation are introduced:
  3. One convolutional layer with 200 neurons to receive and filter size 1 × 100 × N, where N is the number of instances from the dataset. The stride is [1 1].
  4. Two convolutional layers with 200 neurons to receive and filter size 1 × 100 × 200. The stride is [1 1].
  5. Three batch normalization layers with 200 channels.
  6. Three ReLU activation layers.
  7. Three dropout layers with 20 percent dropout.
  8. A max pooling layer with stride [1 1].
  9. A depth concatenation layer to concatenate all the last max pooling layers.
  10. A fully connected layer with ten neurons.

The code that I have tried so far is here:

model1 = Input((train_vector1.shape[1:]))
#1_1
model1 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model1)
model1 = BatchNormalization(200)(model1)
model1 = Dropout(0.2)(model1)
#1_2
model1 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model1)
model1 = BatchNormalization(200)(model1)
model1 = Dropout(0.2)(model1)
#1_3
model1 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model1)
model1 = BatchNormalization(200)(model1)
model1 = Dropout(0.2)(model1)

model1 = MaxPooling1D(strides=1)(model1)
model1 = Flatten()(model1)

## Second Part

model2 = Input((train_vector1.shape[1:]))
#2_1
model2 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model2)
model2 = BatchNormalization(200)(model2)
model2 = Dropout(0.2)(model2)
#2_2
model2 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model2)
model2 = BatchNormalization(200)(model2)
model2 = Dropout(0.2)(model2)
#2_3
model2 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model2)
model2 = BatchNormalization(200)(model2)
model2 = Dropout(0.2)(model2)

model2 = MaxPooling1D(strides=1)(model2)
model2 = Flatten()(model2)

## Third Part

model3 = Input((train_vector1.shape[1:]))
#3_1
model3 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model3)
model3 = BatchNormalization(200)(model3)
model3 = Dropout(0.2)(model3)
#3_2
model3 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model3)
model3 = BatchNormalization(200)(model3)
model3 = Dropout(0.2)(model3)
#3_3
model3 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model3)
model3 = BatchNormalization(200)(model3)
model3 = Dropout(0.2)(model3)

model3 = MaxPooling1D(strides=1)(model3)
model3 = Flatten()(model3)

concat_model = Concatenate()([model1, model2, model3])
output = Dense(10, activation='sigmoid')

I just want to know if my implementation is correct here, or am I misinterpreting something? Am I understanding what the author is trying to do here?

From that image I think that the input could be shared among the three branches. In that case you would have:

input = Input((train_vector1.shape[1:]))

model1 = Conv1D(...)(input)
# ...
model1 = Flatten()(model1)

model2 = Conv1D(...)(input)
# ...
model2 = Flatten()(model2)

model3 = Conv1D(...)(input)
# ...
model3 = Flatten()(model3)

concat_model = Concatenate()([model1, model2, model3])
output = Dense(10, activation='sigmoid')(concat_model)

Also, most probably the convolutions are not 1D but 2D. You can get confirmation of this from the fact that the paper says:

The stride is [1 1]

So we are in two dimensions. The same goes for MaxPooling.
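
To make that concrete, here is a minimal sketch of this reading, not the paper's exact model: it assumes the embedded words arrive as a channels-last tensor of shape (1, 100, 1), uses padding="same" so the stacked (1, 100) kernels still fit after the first convolution, and builds three identical branches (the paper presumably varies the n-gram length per branch, which is not reproduced here):

import tensorflow as tf
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(1, 100, 1))  # a 1 x 100 "image" with a single channel (assumed shape)

def branch(x):
    # One branch: three Conv -> BatchNorm -> ReLU -> Dropout blocks, then max pooling
    for _ in range(3):
        x = layers.Conv2D(200, kernel_size=(1, 100), strides=(1, 1), padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
        x = layers.Dropout(0.2)(x)
    return layers.MaxPooling2D(pool_size=(1, 2), strides=(1, 1), padding="same")(x)

# The three branches share the same input and are concatenated on the channel axis
merged = layers.Concatenate()([branch(inp), branch(inp), branch(inp)])
merged = layers.Flatten()(merged)
output = layers.Dense(10, activation="sigmoid")(merged)  # the ten-neuron fully connected layer

model = Model(inp, output)
model.summary()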

Also you said:

when I run this code, it says too many arguments for "filters". Am I doing anything wrong here?

Let's take:

model1 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model1)

The Conv1D function accepts these arguments (full documentation):

tf.keras.layers.Conv1D(
    filters,
    kernel_size,
    strides=1,
    ...
)

It says too many arguments because you are trying to pass the number of neurons of the convolutional layer as an extra argument, but there is no separate argument for that: the first positional argument is already filters, so Conv1D(200, filters=..., ...) supplies filters twice. The number of output neurons is determined by the other parameters you set.
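
A call with filters given only once would look something like this (note that Conv1D takes a single integer for kernel_size, so the (1, 100) tuple is collapsed to 100 here as an assumption about the intended shape):

conv = tf.keras.layers.Conv1D(filters=200, kernel_size=100, strides=1, activation="relu")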

Same thing also for BatchNormalization. From the docs:

tf.keras.layers.BatchNormalization(
    axis=-1,
    momentum=0.99,
    ...
)

There is no "number of neurons" argument.
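
BatchNormalization infers the number of channels from its input; its first positional argument is axis, so BatchNormalization(200) would be read as axis=200 rather than as a layer width:

bn_ok = tf.keras.layers.BatchNormalization()       # channels inferred from the input
# bn_bad = tf.keras.layers.BatchNormalization(200) # 200 is interpreted as `axis`, not a size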
