How to implement a 1D convolutional neural network with residual connections and batch normalization in Keras?
I am trying to develop a 1D convolutional neural network with residual connections and batch normalization, based on the paper "Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks", using Keras. This is the code so far:
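# imports (assuming the standalone keras package)
from keras.layers import Input, Conv1D, BatchNormalization, ReLU, MaxPooling1D, Dropout, Flatten, Dense, add
from keras.models import Model
# define model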
x = Input(shape=(time_steps, n_features))
# First Conv / BN / ReLU layer
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(x)
y = BatchNormalization()(y)
y = ReLU()(y)
shortcut = MaxPooling1D(pool_size = n_pool)(y)
# First Residual block
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
y = BatchNormalization()(y)
y = ReLU()(y)
y = Dropout(rate=drop_rate)(y)
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
# Add Residual (shortcut)
y = add([shortcut, y])
# Repeated Residual blocks
for k in range(2, 3):  # smaller network for testing
    shortcut = MaxPooling1D(pool_size=n_pool)(y)
    y = BatchNormalization()(y)
    y = ReLU()(y)
    y = Dropout(rate=drop_rate)(y)
    y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
    y = BatchNormalization()(y)
    y = ReLU()(y)
    y = Dropout(rate=drop_rate)(y)
    y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
    y = add([shortcut, y])
z = BatchNormalization()(y)
z = ReLU()(z)
z = Flatten()(z)
z = Dense(64, activation='relu')(z)
predictions = Dense(classes, activation='softmax')(z)
model = Model(inputs=x, outputs=predictions)
# Compiling
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
# Fitting
model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch)
And this is the graph of a simplified model of what I am trying to build.
The model described in the paper uses an increasing number of filters:
The network consists of 16 residual blocks with 2 convolutional layers per block. The convolutional layers all have a filter length of 16 and have 64k filters, where k starts out as 1 and is incremented every 4-th residual block. Every alternate residual block subsamples its inputs by a factor of 2, thus the original input is ultimately subsampled by a factor of 2^8. When a residual block subsamples the input, the corresponding shortcut connections also subsample their input using a Max Pooling operation with the same subsample factor.
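The schedule in that passage can be written out explicitly. The sketch below is only one reading of the quoted text: the helper name block_schedule is hypothetical, and both the exact blocks at which k increases and the choice of which alternate blocks subsample are assumptions, since the passage does not pin them down.

```python
def block_schedule(n_blocks=16, base_filters=64, increment_every=4, subsample_every=2):
    """Return a list of (filters, subsample_factor) tuples, one per residual block."""
    schedule = []
    for i in range(n_blocks):
        # Assumption: k = 1 for blocks 0-3, 2 for blocks 4-7, and so on.
        k = 1 + i // increment_every
        # Assumption: the odd-indexed blocks (1, 3, 5, ...) are the ones that subsample.
        factor = 2 if i % subsample_every == subsample_every - 1 else 1
        schedule.append((base_filters * k, factor))
    return schedule


schedule = block_schedule()

# Multiply the per-block factors to get the overall subsampling of the input.
total_subsampling = 1
for _, factor in schedule:
    total_subsampling *= factor

print(schedule[:5])
print(total_subsampling)
```

Under these assumptions, 8 of the 16 blocks subsample by 2, which is consistent with the overall factor of 2^8 stated in the paper.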
But I can only make it work if I use the same number of filters in every Conv1D layer, with k=1, strides=1 and padding='same', without applying any MaxPooling1D. Any change in these parameters causes a tensor size mismatch and a failure to compile with the following error:
ValueError: Operands could not be broadcast together with shapes (70, 64) (70, 128)
Does anyone have any idea how to fix this size mismatch and make it work?
In addition, if the input has more than one channel (or feature), the mismatch is even worse! Is there a way to deal with more than one channel?
The tensor shape mismatch should be happening in the add([shortcut, y]) layer. Because you are using a MaxPooling1D layer on the shortcut, its time-steps are halved by default (pool_size=2); you can change this with the pool_size parameter. The residual branch, on the other hand, does not reduce the time-steps by the same amount. Before adding shortcut and y, you should apply strides=2 with padding='same' in one of the Conv1D layers of the residual branch (preferably the last one).
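To make the shape bookkeeping concrete, here is a minimal sketch in plain Python that mirrors the output-shape rules of Conv1D with padding='same' and of MaxPooling1D with its default strides (equal to pool_size). The helper names are hypothetical, and pool_size=1 in the first example is an assumption chosen only to reproduce the (70, 64) versus (70, 128) shapes from the error message. Note that those two shapes differ in the channel dimension, which pooling never changes, so the number of filters on both branches has to match as well. The number of input channels only affects the first Conv1D, which maps any n_features to n_filters.

```python
import math


def conv1d_same_shape(steps, filters, strides=1):
    """Output (time_steps, channels) of Conv1D with padding='same'."""
    return (math.ceil(steps / strides), filters)


def maxpool1d_shape(steps, channels, pool_size=2, padding="valid"):
    """Output (time_steps, channels) of MaxPooling1D with strides equal to pool_size."""
    if padding == "same":
        return (math.ceil(steps / pool_size), channels)
    return ((steps - pool_size) // pool_size + 1, channels)


# Channel mismatch: the shortcut keeps 64 channels, the branch now has 128.
shortcut = maxpool1d_shape(70, 64, pool_size=1)  # assumption: n_pool = 1
branch = conv1d_same_shape(70, 128)
print(shortcut, branch, shortcut == branch)

# Time-step mismatch: pooling halves the shortcut, the branch needs strides=2.
shortcut = maxpool1d_shape(70, 64)
print(shortcut, conv1d_same_shape(70, 64, strides=1), conv1d_same_shape(70, 64, strides=2))

# Odd lengths: 'valid' pooling rounds down while a 'same' convolution rounds up.
print(maxpool1d_shape(35, 64), maxpool1d_shape(35, 64, padding="same"),
      conv1d_same_shape(35, 64, strides=2))
```

The last case suggests also passing padding='same' to MaxPooling1D, so that both branches round the same way when the number of time-steps is odd.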
For reference, you can check out the ResNet code here: Keras-applications-github
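As a rough illustration of a residual block whose two branches always agree in shape, here is a sketch using tensorflow.keras. It is not the paper's implementation: the function names residual_block and build_demo_model, the kernel size, the dropout rate and the input shape are placeholders, and the 1x1 Conv1D used to match the channel count on the shortcut is an assumption borrowed from common ResNet practice rather than something the paper or the answer prescribes.

```python
from tensorflow.keras import Input, Model, layers


def residual_block(y, filters, subsample, kernel_size=16, drop_rate=0.2):
    """Pre-activation residual block; both branches end with the same shape."""
    shortcut = y
    if subsample:
        # padding='same' rounds odd lengths up, like the strided convolution below.
        shortcut = layers.MaxPooling1D(pool_size=2, padding="same")(shortcut)
    if shortcut.shape[-1] != filters:
        # Assumption: a 1x1 convolution brings the shortcut to the new channel count.
        shortcut = layers.Conv1D(filters, 1, padding="same")(shortcut)

    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Dropout(drop_rate)(y)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Dropout(drop_rate)(y)
    # The last convolution carries the stride, as suggested in the answer.
    y = layers.Conv1D(filters, kernel_size, strides=2 if subsample else 1, padding="same")(y)
    return layers.Add()([shortcut, y])


def build_demo_model(time_steps=70, n_features=3):
    """Tiny three-block network, only meant to check that the shapes line up."""
    x = Input(shape=(time_steps, n_features))
    y = layers.Conv1D(64, 16, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = residual_block(y, filters=64, subsample=False)
    y = residual_block(y, filters=128, subsample=True)
    y = residual_block(y, filters=128, subsample=True)
    return Model(inputs=x, outputs=y)


model = build_demo_model()
print(model.output_shape)
```

Because the first Conv1D maps any number of input features to 64 channels, the same block works unchanged for multi-channel inputs.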