Merge multiple CNN models
I am trying to implement the paper here.
This is the CNN architecture I am trying to implement:
This text is from the paper itself, describing the layers:
The CNN architecture in Figure 5 is shown in a top-down fashion, starting from the start (top) node and ending at the end (bottom) node. "NL" stands for N-gram length. The breakdown is:
The code I have tried so far is here:
model1 = Input((train_vector1.shape[1:]))
#1_1
model1 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model1)
model1 = BatchNormalization(200)(model1)
model1 = Dropout(0.2)(model1)
#1_2
model1 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model1)
model1 = BatchNormalization(200)(model1)
model1 = Dropout(0.2)(model1)
#1_3
model1 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model1)
model1 = BatchNormalization(200)(model1)
model1 = Dropout(0.2)(model1)
model1 = MaxPooling1D(strides=1)(model1)
model1 = Flatten()(model1)
## Second Part
model2 = Input((train_vector1.shape[1:]))
#2_1
model2 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model2)
model2 = BatchNormalization(200)(model2)
model2 = Dropout(0.2)(model2)
#2_2
model2 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model2)
model2 = BatchNormalization(200)(model2)
model2 = Dropout(0.2)(model2)
#2_3
model2 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model2)
model2 = BatchNormalization(200)(model2)
model2 = Dropout(0.2)(model2)
model2 = MaxPooling1D(strides=1)(model2)
model2 = Flatten()(model2)
## Third Part
model3 = Input((train_vector1.shape[1:]))
#3_1
model3 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model3)
model3 = BatchNormalization(200)(model3)
model3 = Dropout(0.2)(model3)
#3_2
model3 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model3)
model3 = BatchNormalization(200)(model3)
model3 = Dropout(0.2)(model3)
#3_3
model3 = Conv1D(200, filters = 200, kernel_size=(1, 100), stride = 1, activation = "relu")(model3)
model3 = BatchNormalization(200)(model3)
model3 = Dropout(0.2)(model3)
model3 = MaxPooling1D(strides=1)(model3)
model3 = Flatten()(model3)
concat_model = Concatenate()([model1, model2, model3])
output = Dense(10, activation='sigmoid')
I just want to know whether my implementation is correct here, or whether I have misunderstood something. Have I understood what the author is trying to do?
From that image, I think the input could be shared among the other layers. In that case you would have:
input = Input((train_vector1.shape[1:]))
model1 = Conv1D(...)(input)
# ...
model1 = Flatten()(model1)
model2 = Conv1D(...)(input)
# ...
model2 = Flatten()(model2)
model3 = Conv1D(...)(input)
# ...
model3 = Flatten()(model3)
concat_model = Concatenate()([model1, model2, model3])
output = Dense(10, activation='sigmoid')
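Putting that sketch together, a minimal runnable version with one shared input could look like the following. The sequence length of 50, the embedding size of 100, the n-gram lengths 1/2/3, and `padding="same"` are assumptions made here for illustration, not values taken from the paper:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

SEQ_LEN, EMB_DIM = 50, 100  # hypothetical input: 50 tokens x 100-dim embeddings

inp = layers.Input(shape=(SEQ_LEN, EMB_DIM))

def branch(x, kernel_size):
    # One branch: three Conv1D blocks (conv + batch norm + dropout), then pooling
    for _ in range(3):
        x = layers.Conv1D(filters=200, kernel_size=kernel_size,
                          strides=1, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Dropout(0.2)(x)
    x = layers.MaxPooling1D(pool_size=2, strides=1)(x)
    return layers.Flatten()(x)

# Three branches with different (assumed) n-gram lengths, all sharing `inp`
branches = [branch(inp, k) for k in (1, 2, 3)]
merged = layers.Concatenate()(branches)
out = layers.Dense(10, activation="sigmoid")(merged)

model = Model(inputs=inp, outputs=out)
```

Note that `Dense` is called on the concatenated tensor and the graph is wrapped in a `Model`; in the asker's code the final `Dense(10, ...)` layer is created but never applied to anything.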
It is also very likely that the convolutions are not one-dimensional but two-dimensional. You can get confirmation of that from the fact that the paper says:
Stride of [1 1]
We are in two dimensions here. The same goes for the MaxPooling.
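If the layers really are 2D, the asker's `kernel_size=(1, 100)` tuples would fit `Conv2D` rather than `Conv1D`. A sketch of one such block, assuming a hypothetical 50x100x1 input (tokens x embedding x channel):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical input: 50 tokens x 100-dim embedding x 1 channel
x = layers.Input(shape=(50, 100, 1))
# A (1, 100) kernel spans the full embedding, so the width collapses to 1
y = layers.Conv2D(filters=200, kernel_size=(1, 100),
                  strides=(1, 1), activation="relu")(x)
y = layers.MaxPooling2D(pool_size=(2, 1), strides=(1, 1))(y)
print(y.shape)  # (None, 49, 1, 200)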
You also said:
When I run this code, it says too many arguments for "filters". Am I doing something wrong here?
Let's take:
model1 = Conv1D(200, filters=train_vector1.shape[0], kernel_size=(1, 100), strides = 1, activation = "relu")(model1)
The Conv1D function accepts these arguments (full documentation):
tf.keras.layers.Conv1D(
filters,
kernel_size,
strides=1,
...
)
It says there are too many arguments because you are trying to specify the number of neurons of the convolutional layer, but there is simply no argument for that, so you don't need to pass it. The number of neurons depends on the other parameters you set.
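The error can be reproduced directly: in the asker's call, `200` is bound positionally to `filters`, and then `filters=` is passed again as a keyword, so Python rejects the call before Keras even sees it:

```python
from tensorflow.keras import layers

# 200 binds positionally to `filters`, then `filters` is given again as a
# keyword, so Python raises "got multiple values for argument 'filters'"
try:
    layers.Conv1D(200, filters=300, kernel_size=3)
except TypeError as e:
    print(e)

# Correct: `filters` alone sets the number of output channels
conv = layers.Conv1D(filters=200, kernel_size=3, strides=1, activation="relu")
```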
The same goes for BatchNormalization. From the documentation:
tf.keras.layers.BatchNormalization(
axis=-1,
momentum=0.99,
...
)
There is no "number of neurons" argument.
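Worse, `BatchNormalization`'s first positional argument is `axis`, so `BatchNormalization(200)` is silently interpreted as "normalize over axis 200" rather than as a layer size, and only fails later when the layer is built. The default is usually what you want:

```python
from tensorflow.keras import layers

# BatchNormalization's first positional argument is `axis`, not a size:
# BatchNormalization(200) would mean "normalize over axis 200".
bn = layers.BatchNormalization()  # default axis=-1, the channel axis
print(bn.axis)  # -1
```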