
Is a pooling layer mandatory when using a CNN with an LSTM in Keras?

I am using a CNN+LSTM for a binary classification problem. My code is as follows.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense, Dropout

def create_network():
    model = Sequential()
    model.add(Conv1D(200, kernel_size=2, activation='relu', input_shape=(35, 6)))
    model.add(Conv1D(200, kernel_size=2, activation='relu'))
    model.add(MaxPooling1D(3))
    model.add(LSTM(200, return_sequences=True))
    model.add(LSTM(200, return_sequences=True))
    model.add(LSTM(200))
    model.add(Dense(100))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

When I use the above model I get poor results. However, when I remove the model.add(MaxPooling1D(3)) layer, the results improve somewhat.

My questions are as follows.

  • Is it mandatory to have a pooling layer when a CNN is used with an LSTM (given that I am also using a dropout layer)?
  • If it is mandatory, what other kinds of pooling layers would you suggest?

I am happy to provide more details if needed.

Firstly, you don't have to use a MaxPooling1D layer. MaxPooling here will only reduce the number of inputs passed on to the LSTM (in this case). From a purely technical point of view, LSTMs can work with any sequence length, and Keras automatically sets the right number of input features.

There are some interesting things going on here, though, that you might want to take a look at:

  1. It's hard to say that one pooling mechanism will work better than another. However, the intuition is that max pooling works better for inferring from extreme cases, while average pooling works better at ignoring the extremities.

  2. You left the strides implicit, and it should be noted that the default stride value differs between pooling and convolution layers (None vs. 1). This means that comparing the network with and without the max pooling is not exactly comparing apples to apples, as the pooling greatly reduces the amount of data the LSTM layers receive.
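To see how large that reduction is in this particular model, here is a minimal pure-Python sketch of the output-length arithmetic for 'valid' padding (the helper names are illustrative, not Keras API): Conv1D defaults to stride 1, while MaxPooling1D's strides=None default means the stride equals the pool size.

```python
def conv1d_out_len(length, kernel_size, stride=1):
    # 'valid' padding: floor((L - k) / s) + 1; Conv1D defaults to stride 1
    return (length - kernel_size) // stride + 1

def pool1d_out_len(length, pool_size, strides=None):
    # Keras MaxPooling1D: strides=None means the stride equals pool_size
    stride = pool_size if strides is None else strides
    return (length - pool_size) // stride + 1

steps = 35                           # from input_shape=(35, 6)
steps = conv1d_out_len(steps, 2)     # after Conv1D(kernel_size=2) -> 34
steps = conv1d_out_len(steps, 2)     # after Conv1D(kernel_size=2) -> 33
print(steps)                         # 33 timesteps reach the LSTM without pooling
print(pool1d_out_len(steps, 3))      # only 11 timesteps after MaxPooling1D(3)
```

So removing MaxPooling1D(3) triples the number of timesteps the first LSTM sees (33 vs. 11), which by itself can explain a noticeable difference in results.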
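The intuition in point 1 can also be made concrete with a small pure-Python sketch (no Keras involved; max_pool1d and avg_pool1d are hypothetical helpers mimicking non-overlapping 'valid' pooling): max pooling preserves an isolated spike, while average pooling dilutes it.

```python
def max_pool1d(xs, pool_size):
    # non-overlapping windows, 'valid' padding (stride == pool_size)
    return [max(xs[i:i + pool_size])
            for i in range(0, len(xs) - pool_size + 1, pool_size)]

def avg_pool1d(xs, pool_size):
    return [sum(xs[i:i + pool_size]) / pool_size
            for i in range(0, len(xs) - pool_size + 1, pool_size)]

signal = [0.0, 0.0, 9.0, 0.0, 1.0, 2.0]  # one extreme activation at index 2

print(max_pool1d(signal, 3))  # [9.0, 2.0] -> the spike survives
print(avg_pool1d(signal, 3))  # [3.0, 1.0] -> the spike is averaged away
```

Which behaviour helps depends on whether the extreme activations carry the signal for your classification task or are mostly noise.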
