keras cnn_lstm input layer not accepting 1-D input

I have sequences of long 1-D vectors (3000 digits) that I am trying to classify. I have previously implemented a simple CNN to classify them with relative success:

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

def create_shallow_model(shape,repeat_length,stride):
    model = Sequential()
    model.add(Conv1D(75,repeat_length,strides=stride,padding='same', input_shape=shape, activation='relu'))
    model.add(MaxPooling1D(repeat_length))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model

However, I am looking to improve the performance by stacking an LSTM/RNN on the end of the network.

I am having difficulty with this as I cannot seem to find a way for the network to accept the data.

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense, LSTM, TimeDistributed

def cnn_lstm(shape,repeat_length,stride):
    model = Sequential()
    model.add(TimeDistributed(Conv1D(75,repeat_length,strides=stride,padding='same', activation='relu'),input_shape=(None,)+shape))
    model.add(TimeDistributed(MaxPooling1D(repeat_length)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(6,return_sequences=True))
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model

model=cnn_lstm(X.shape[1:],1000,1)
tprs,aucs=calculate_roc(model,3,100,train_X,train_y,test_X,test_y,tprs,aucs)

But I get the following error:

ValueError: Error when checking input: expected time_distributed_4_input to have 4 dimensions, but got array with shape (50598, 3000, 1)

My questions are:

  1. Is this a correct way of analysing this data?

  2. If so, how do I get the network to accept and classify the input sequences?

There is no need to add those TimeDistributed wrappers. Currently, before adding the LSTM layer, your model looks like this (I have assumed repeat_length=5 and stride=1):

Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_2 (Conv1D)            (None, 3000, 75)          450       
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 600, 75)           0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 45000)             0         
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 45001     
=================================================================
Total params: 45,451
Trainable params: 45,451
Non-trainable params: 0
_________________________________________________________________
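
For context on the ValueError itself: wrapping a layer in TimeDistributed adds a leading time axis, so the wrapped Conv1D expects 4-D input of shape (batch, timesteps, steps, channels), while the data is the 3-D array (50598, 3000, 1). A minimal sketch illustrating the mismatch (assuming standard Keras imports; the kernel size 5 is just an example value):

from keras.models import Sequential
from keras.layers import TimeDistributed, Conv1D

model = Sequential()
# TimeDistributed applies Conv1D to every element of an extra time axis,
# so the expected input is 4-D: (batch, timesteps, steps, channels)
model.add(TimeDistributed(Conv1D(75, 5, padding='same', activation='relu'),
                          input_shape=(None, 3000, 1)))
print(model.input_shape)  # (None, None, 3000, 1): 4-D, hence the error
                          # when the model is fed the 3-D training array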

So if you want to add an LSTM layer, you can put it right after the MaxPooling1D layer, e.g. model.add(LSTM(16, activation='relu')), and just remove the Flatten layer.
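
Putting it together, a minimal sketch of the modified model-building function (assuming the same Keras imports as the question's code; the LSTM width of 16 is just an illustrative value):

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

def cnn_lstm(shape, repeat_length, stride):
    model = Sequential()
    # Conv1D consumes the (3000, 1) sequences directly; no TimeDistributed needed
    model.add(Conv1D(75, repeat_length, strides=stride, padding='same',
                     input_shape=shape, activation='relu'))
    model.add(MaxPooling1D(repeat_length))
    # The LSTM reads the pooled feature sequence and returns only its final
    # state, so no Flatten layer is needed before the classifier
    model.add(LSTM(16, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model

Now the model looks like this: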

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_4 (Conv1D)            (None, 3000, 75)          450       
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 600, 75)           0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 16)                5888      
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 17        
=================================================================
Total params: 6,355
Trainable params: 6,355
Non-trainable params: 0
_________________________________________________________________

If you want, you can pass the return_sequences=True argument to the LSTM layer and keep the Flatten layer, as sketched below. But only do that after you have tried the first approach and gotten poor results, since return_sequences=True may not be necessary at all; it only increases your model size and may decrease model performance.
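
If you do try that variant, you would replace the LSTM and Dense lines of the sketch above with something like the following (layer sizes again illustrative):

# return_sequences=True makes the LSTM emit one 16-dim output per timestep
# instead of only its final state, so Flatten is needed again before Dense
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(Flatten())  # flattens the (600, 16) sequence of LSTM outputs
model.add(Dense(1, activation='sigmoid'))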


As a side note: why did you change the loss function to sparse_categorical_crossentropy in the second model? There is no need to do that since binary_crossentropy would work fine.
