
LSTM overfitting but validation accuracy not improving

The task I am trying to do is to classify EEG signals into 4 possible classes. The data is divided into trials: subjects were asked to think about doing one of four actions, and the classification task is to predict which action they were thinking about based on the EEG signals.

I have ~2500 trials. Each trial has 22 channels of EEG sensor input and 1000 time steps. My baseline is a single-layer MLP, which gets ~45% validation accuracy.
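For reference, a minimal sketch of such a baseline, flattening each trial before the dense layers (the hidden-layer width here is an assumption, not my exact setup):

from keras.models import Sequential
from keras.layers import Dense, Flatten

# Hypothetical baseline: flatten each (1000 time steps x 22 channels)
# trial and classify with one hidden layer.
mlp = Sequential()
mlp.add(Flatten(input_shape=(1000, 22)))
mlp.add(Dense(128, activation='relu'))
mlp.add(Dense(4, activation='softmax'))
mlp.compile(loss='categorical_crossentropy', optimizer='adam',
            metrics=['accuracy'])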

Since categorical_crossentropy expects one-hot-encoded targets, I mapped the labels 0, 1, 2, 3 to their corresponding one-hot encodings before training (y_total_new). At first I manually created an 80/20 train/test split, but then just opted to let Keras do the split (validation_split=0.2).
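The mapping can be done with keras.utils.to_categorical; a minimal sketch with made-up example labels:

from keras.utils import to_categorical
import numpy as np

# Map integer labels 0..3 to one-hot vectors, e.g. 2 -> [0, 0, 1, 0].
y_total = np.array([0, 1, 2, 3, 2, 1])   # example labels only
y_total_new = to_categorical(y_total, num_classes=4)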

This is my first LSTM experiment ever. I chose 1000 units to begin with (matching the code below). I added a fully connected layer with four neurons to map to the output classes, and used categorical_crossentropy as my loss function. So far with the LSTM, I can't get above 25% validation accuracy, which is chance level for four balanced classes. If I run the following code for 50 epochs instead of 3, the LSTM overfits the training data but the validation accuracy stays around 0.25.

Since this is my first time using an LSTM, I'm wondering if someone could offer insight into design cues I might have missed, or point me in the right direction.

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import LSTM


time_steps = 1000   # samples per trial
n_features = 22     # EEG channels

model = Sequential()
# Single LSTM layer; return_sequences=False keeps only the last output.
model.add(LSTM(1000, return_sequences=False, input_shape=(time_steps, n_features)))
model.add(Dropout(0.2))
model.add(Dense(22, activation='tanh'))
model.add(Dense(4, activation='sigmoid'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Keras holds out the last 20% of X/y as validation data.
model.fit(X, y_total_new, validation_split=0.2, batch_size=16, epochs=50)
# score = model.evaluate(X_test, y_test_new, batch_size=16)

This problem sometimes comes from a validation set that is too small. With too few validation samples, the measured validation accuracy is noisy and may not reflect genuine improvement in the model.

Try increasing the validation split to 0.3 to check whether this is where the problem comes from. If it is, you can create your own (more representative) validation set and pass it to the model to get a more reliable validation accuracy, as in the sketch below.
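For example, a stratified split built with scikit-learn and passed via validation_data (assumes X, y_total_new, and model from the question):

from sklearn.model_selection import train_test_split

# Stratify on the integer class labels so every class appears in the
# same proportion in the train and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y_total_new, test_size=0.3,
    stratify=y_total_new.argmax(axis=1), random_state=42)

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          batch_size=16, epochs=50)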

If the classes turn out to be imbalanced, try oversampling or undersampling so that each label is represented by roughly the same number of samples; a sketch follows.
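A minimal random-oversampling sketch in NumPy (assumes X and y_total_new as above; the index array here is illustrative only):

import numpy as np

labels = y_total_new.argmax(axis=1)   # back to integer labels 0..3
max_count = np.bincount(labels).max()

# Randomly resample each class up to the size of the largest class.
idx = np.concatenate([
    np.random.choice(np.where(labels == c)[0], size=max_count, replace=True)
    for c in np.unique(labels)])
np.random.shuffle(idx)
X_balanced, y_balanced = X[idx], y_total_new[idx]

Passing a class_weight dict to model.fit is an alternative that avoids duplicating samples.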

When working with LSTMs, you also have to be careful when creating the dataset splits, so that the model sees representative sequences to learn from; this matters especially when each trial contributes only a small amount of data.

Have you tried adding convolutional layers as the first layers of your model? I am currently using this approach to classify EMG signals into 53 classes. The convolutional layers are supposed to learn features from the raw data automatically and then feed them to the LSTM layers. There are several possible architectures; DeepConvLSTM is one of them:

[Figure: DeepConvLSTM architecture]

DeepConvLSTM paper: www.mdpi.com/1424-8220/16/1/115/htm
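A minimal Conv1D-then-LSTM sketch in Keras along these lines (layer sizes and kernel widths are assumptions, not the paper's exact architecture):

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense, Dropout

model = Sequential()
# Convolutions learn local features across the 22 channels over short
# windows of time steps; pooling shortens the 1000-step sequence so the
# LSTM sees a much shorter feature sequence.
model.add(Conv1D(64, kernel_size=5, activation='relu',
                 input_shape=(1000, 22)))
model.add(MaxPooling1D(pool_size=4))
model.add(Conv1D(64, kernel_size=5, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(LSTM(64, return_sequences=False))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])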
