
Combining CNN and bidirectional LSTM

I am trying to combine a CNN and an LSTM for image classification.

I tried the following code and I am getting an error. I have 4 classes that I want to train and test on.

Following is the code:

from keras.models import Sequential
from keras.layers import LSTM,Conv2D,MaxPooling2D,Dense,Dropout,Input,Bidirectional,Softmax,TimeDistributed


input_shape = (200,300,3)
Model = Sequential()
Model.add(TimeDistributed(Conv2D(
            filters=16, kernel_size=(12, 16), activation='relu', input_shape=input_shape)))
Model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2),strides=2)))
Model.add(TimeDistributed(Conv2D(
            filters=24, kernel_size=(8, 12), activation='relu')))
Model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2),strides=2)))
Model.add(TimeDistributed(Conv2D(
            filters=32, kernel_size=(5, 7), activation='relu')))
Model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2),strides=2)))
Model.add(Bidirectional(LSTM((10),return_sequences=True)))
Model.add(Dense(64,activation='relu'))
Model.add(Dropout(0.5))
Model.add(Softmax(4))
Model.compile(loss='sparse_categorical_crossentropy',optimizer='adam')
Model.build(input_shape)

I am getting the following error:

"Input tensor must be of rank 3, 4 or 5 but was {}.".format(n + 2)) ValueError: Input tensor must be of rank 3, 4 or 5 but was 2. “输入张量必须为 3、4 或 5 级,但为 {}。”.format(n + 2)) ValueError:输入张量必须为 3、4 或 5 级但为 2。

I found several problems in the code:

  1. your data are 4D, so plain Conv2D layers work fine; TimeDistributed is not needed
  2. your output is 2D, so set return_sequences=False in the last LSTM cell
  3. your last layers are very messy: there is no need to put a dropout between a layer's output and its activation
  4. you need categorical_crossentropy and not sparse_categorical_crossentropy, because your target is one-hot encoded
  5. LSTM expects 3D data, so you need to pass from 4D (the output of the convolutions) to 3D. There are two possibilities you can adopt: reshape to (batch_size, H, W * channels), or reshape to (batch_size, W, H * channels). Either way you end up with 3D data to use inside your LSTM
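To see what the last point means concretely, here is a quick NumPy sketch of the two reshape possibilities (the tensor sizes are arbitrary placeholders, not taken from the model above):

```python
import numpy as np

# dummy 4D convolution output: (batch, H, W, channels)
x = np.zeros((8, 19, 29, 32))

# possibility 1: keep H as the time axis -> (batch, H, W * channels)
opt1 = x.reshape(x.shape[0], x.shape[1], x.shape[2] * x.shape[3])

# possibility 2: keep W as the time axis -> (batch, W, H * channels)
opt2 = np.transpose(x, (0, 2, 1, 3)).reshape(
    x.shape[0], x.shape[2], x.shape[1] * x.shape[3])

print(opt1.shape)  # (8, 19, 928)
print(opt2.shape)  # (8, 29, 608)
```

Both results are 3D, so either one can be fed directly into an LSTM layer.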

Here is a full model example:

from keras.models import Sequential
from keras.layers import (LSTM, Conv2D, MaxPooling2D, Dense, Bidirectional,
                          Lambda, Reshape, Permute)

def ReshapeLayer(x):
    # x comes in as 4D: (batch, H, W, channels)
    shape = x.shape
    
    # 1 possibility: H, W*channel
    reshape = Reshape((shape[1],shape[2]*shape[3]))(x)
    
    # 2 possibility: W, H*channel
    # transpose = Permute((2,1,3))(x)
    # reshape = Reshape((shape[2],shape[1]*shape[3]))(transpose)
    
    return reshape

input_shape = (200,300,3)  # same input shape as in the question
nclasses = 4               # number of target classes

model = Sequential()
model.add(Conv2D(filters=16, kernel_size=(12, 16), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2),strides=2))
model.add(Conv2D(filters=24, kernel_size=(8, 12), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2),strides=2))
model.add(Conv2D(filters=32, kernel_size=(5, 7), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2),strides=2))
model.add(Lambda(ReshapeLayer)) # <========== pass from 4D to 3D
model.add(Bidirectional(LSTM(10, activation='relu', return_sequences=False)))
model.add(Dense(nclasses,activation='softmax'))

model.compile(loss='categorical_crossentropy',optimizer='adam')
model.summary()
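As a sanity check, the sequence length the LSTM will see can be worked out by hand. This is a small sketch assuming the 200x300x3 input from the question and the Keras defaults ('valid' padding for Conv2D, non-overlapping 2x2 max pooling):

```python
def conv_out(size, kernel):
    # 'valid' convolution: output = input - kernel + 1
    return size - kernel + 1

def pool_out(size, pool=2, stride=2):
    # 2x2 max pooling with stride 2 roughly halves the dimension
    return (size - pool) // stride + 1

h, w = 200, 300
for kh, kw in [(12, 16), (8, 12), (5, 7)]:
    h, w = conv_out(h, kh), conv_out(w, kw)
    h, w = pool_out(h), pool_out(w)

channels = 32  # filters in the last Conv2D
print(h, w, channels)                    # 19 29 32
print("LSTM input:", (h, w * channels))  # reshape possibility 1: (19, 928)
```

So with reshape possibility 1, the bidirectional LSTM receives 19 time steps of 928 features each.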

Here is the running notebook.
