keras cnn_lstm input layer not accepting 1-D input
I have sequences of long 1-D vectors (3000 digits) that I am trying to classify. I have previously implemented a simple CNN to classify them with relative success:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

def create_shallow_model(shape, repeat_length, stride):
    model = Sequential()
    model.add(Conv1D(75, repeat_length, strides=stride, padding='same',
                     input_shape=shape, activation='relu'))
    model.add(MaxPooling1D(repeat_length))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
However, I am looking to improve the performance by stacking an LSTM/RNN on the end of the network. I am having difficulty with this, as I cannot seem to find a way for the network to accept the data.
from keras.layers import TimeDistributed, LSTM

def cnn_lstm(shape, repeat_length, stride):
    model = Sequential()
    model.add(TimeDistributed(Conv1D(75, repeat_length, strides=stride,
                                     padding='same', activation='relu'),
                              input_shape=(None,) + shape))
    model.add(TimeDistributed(MaxPooling1D(repeat_length)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(6, return_sequences=True))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
model = cnn_lstm(X.shape[1:], 1000, 1)
tprs, aucs = calculate_roc(model, 3, 100, train_X, train_y, test_X, test_y, tprs, aucs)
But I get the following error:
ValueError: Error when checking input: expected time_distributed_4_input to have 4 dimensions, but got array with shape (50598, 3000, 1)
My questions are:
Is this a correct way of analysing this data?
If so, how do I get the network to accept and classify the input sequences?
There is no need to add those TimeDistributed wrappers. Currently, before adding the LSTM layer, your model looks like this (I have assumed repeat_length=5 and stride=1):
Layer (type) Output Shape Param #
=================================================================
conv1d_2 (Conv1D) (None, 3000, 75) 450
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 600, 75) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 45000) 0
_________________________________________________________________
dense_4 (Dense) (None, 1) 45001
=================================================================
Total params: 45,451
Trainable params: 45,451
Non-trainable params: 0
_________________________________________________________________
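As a sanity check, the parameter counts in the summary above can be reproduced by hand. This is a sketch using the standard Keras formulas for Conv1D and Dense, with the shapes (3000 input steps, 1 channel, repeat_length=5) taken from the summary:

```python
# Conv1D params = kernel_size * in_channels * filters + filters (biases)
kernel_size, in_channels, filters = 5, 1, 75
conv_params = kernel_size * in_channels * filters + filters
print(conv_params)  # 450, matches conv1d_2

# MaxPooling1D(5) shrinks 3000 steps to 600; Flatten gives 600 * 75 features
flat_units = (3000 // 5) * filters
dense_params = flat_units * 1 + 1  # Dense(1): one weight per feature + bias
print(flat_units, dense_params)  # 45000, 45001

print(conv_params + dense_params)  # 45451, the total in the summary
```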
So if you want to add an LSTM layer, you can put it right after the MaxPooling1D layer, like model.add(LSTM(16, activation='relu')), and just remove the Flatten layer. Now the model looks like this:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_4 (Conv1D) (None, 3000, 75) 450
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 600, 75) 0
_________________________________________________________________
lstm_1 (LSTM) (None, 16) 5888
_________________________________________________________________
dense_5 (Dense) (None, 1) 17
=================================================================
Total params: 6,355
Trainable params: 6,355
Non-trainable params: 0
_________________________________________________________________
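The lstm_1 entry of 5888 parameters can also be verified by hand. An LSTM with u units on d input features has four gate blocks, each with (d + u) weights plus one bias per unit; this sketch checks that formula against the summary:

```python
# LSTM params = 4 * (units * (input_dim + units) + units)
d, u = 75, 16  # 75 Conv1D filters feed the LSTM; 16 LSTM units
lstm_params = 4 * (u * (d + u) + u)
print(lstm_params)  # 5888, matches lstm_1

conv_params = 5 * 1 * 75 + 75   # Conv1D layer, unchanged from before
dense_params = u * 1 + 1        # Dense(1) on the 16-dim LSTM output
print(conv_params + lstm_params + dense_params)  # 6355, the new total
```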
If you want, you can pass the return_sequences=True argument to the LSTM layer and keep the Flatten layer. But only do that after you have tried the first approach and gotten poor results, since adding return_sequences=True may not be necessary at all: it only increases your model size and may decrease model performance.
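To see why that variant grows the model, here is a quick sketch of the resulting shapes (these numbers are derived from the summaries above, not from the original post): with return_sequences=True the LSTM emits its 16-dim state at every one of the 600 pooled timesteps, and the Dense layer scales accordingly.

```python
timesteps, units = 600, 16      # MaxPooling1D output steps, LSTM units
# LSTM(16, return_sequences=True) output: (None, 600, 16)
flat_features = timesteps * units  # Flatten -> 9600 features
dense_params = flat_features * 1 + 1  # Dense(1) now needs 9601 parameters
print(flat_features, dense_params)  # versus only 17 without return_sequences
```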
As a side note: why did you change the loss function to sparse_categorical_crossentropy in the second model? There is no need to do that, since binary_crossentropy would work fine.