I am building a hurricane track predictor using satellite data. I have a many-to-many multilayer LSTM model, with input and output arrays following the structure [samples[time[features]]]. The input and output features are the coordinates of the hurricane, WS, and other dimensions.
The problem is that the error does not decrease and, as a consequence, the model always predicts a constant. After reading several posts, I standardized the data and removed some unnecessary layers, but the model still predicts the same output every time.
I think the model is big enough and the activation functions make sense, given that the outputs are all within [-1, 1]. So my question is: what am I doing wrong?
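For reference, the standardization step mentioned above could look roughly like the following for arrays shaped [samples[time[features]]]. This is only a sketch, not the code actually used in this question: the helper names (`fit_standardizer`, `standardize`, `unstandardize`) and the random data are hypothetical. The statistics are computed per feature on the training set only, so the same values can be reused on test data and to map predictions back to physical units.

```python
import numpy as np

def fit_standardizer(train):
    """Return per-feature mean and std computed over all samples and time steps.

    `train` is assumed to have shape (samples, time, features).
    """
    mean = train.mean(axis=(0, 1))
    std = train.std(axis=(0, 1))
    # Guard against division by zero for features that never vary.
    std = np.where(std == 0, 1.0, std)
    return mean, std

def standardize(x, mean, std):
    """Scale x feature-wise; broadcasting applies mean/std along the last axis."""
    return (x - mean) / std

def unstandardize(z, mean, std):
    """Invert standardize(), e.g. to read predictions in original units."""
    return z * std + mean

# Hypothetical data: 100 samples, 8 time steps, 3 features on different scales.
rng = np.random.default_rng(0)
train = rng.normal(loc=[20.0, -60.0, 50.0], scale=[5.0, 10.0, 15.0], size=(100, 8, 3))

mean, std = fit_standardizer(train)
train_z = standardize(train, mean, std)
restored = unstandardize(train_z, mean, std)
```

If inputs and outputs have different features, each array would need its own `mean` and `std`.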
The model is the following:
# Imports assumed (tensorflow.keras); the original post did not show them.
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Reshape
from tensorflow.keras.callbacks import EarlyStopping

class Stacked_LSTM():
    def __init__(self, training_inputs, training_outputs, n_steps_in, n_steps_out, n_features_in, n_features_out, metrics, optimizer, epochs):
        self.training_inputs = training_inputs
        self.training_outputs = training_outputs
        self.epochs = epochs
        self.n_steps_in = n_steps_in
        self.n_steps_out = n_steps_out
        self.n_features_in = n_features_in
        self.n_features_out = n_features_out
        self.metrics = metrics
        self.optimizer = optimizer
        # Stop when the training loss has not improved for 30 epochs.
        self.stop = EarlyStopping(monitor='loss', min_delta=0.000000000001, patience=30)
        self.model = Sequential()
        # kernel_regularizer=regularizers.l2(0.001) was tried on this layer; not a good idea.
        self.model.add(LSTM(360, activation='tanh', return_sequences=True, input_shape=(self.n_steps_in, self.n_features_in,)))
        self.model.add(layers.Dropout(0.1))
        self.model.add(LSTM(360, activation='tanh'))
        self.model.add(layers.Dropout(0.1))
        # Flat linear output, reshaped to (n_steps_out, n_features_out).
        self.model.add(Dense(self.n_features_out*self.n_steps_out))
        self.model.add(Reshape((self.n_steps_out, self.n_features_out)))
        self.model.compile(optimizer=self.optimizer, loss='mae', metrics=[metrics])

    def fit(self):
        return self.model.fit(self.training_inputs, self.training_outputs, callbacks=[self.stop], epochs=self.epochs)

    def predict(self, input):
        return self.model.predict(input)
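One way to confirm the "constant prediction" symptom numerically is to measure how much the predictions vary across samples, and to compare the model's MAE with a trivial baseline. The sketch below is a suggested diagnostic, not part of the original code; the function names and the random arrays are hypothetical. It relies on the fact that the constant minimizing mean absolute error is the median, so a model trained with loss='mae' that collapses to a constant would be expected to sit near the per-step, per-feature training median.

```python
import numpy as np

def prediction_spread(pred):
    """Std of predictions across samples, per output step and feature.

    `pred` is assumed to have shape (samples, n_steps_out, n_features_out).
    Values near zero mean the model outputs the same thing for every input.
    """
    return pred.std(axis=0)

def mae(y_true, y_pred):
    """Mean absolute error over all samples, steps, and features."""
    return np.abs(y_true - y_pred).mean()

def median_baseline_mae(y_train, y_test):
    """MAE obtained by always predicting the training median."""
    median = np.median(y_train, axis=0)  # shape (n_steps_out, n_features_out)
    return mae(y_test, median)

# Hypothetical targets standing in for real training and test outputs.
rng = np.random.default_rng(1)
y_train = rng.normal(size=(200, 2, 3))
y_test = rng.normal(size=(50, 2, 3))

# A "collapsed" model: the same output for every test sample.
constant_pred = np.broadcast_to(np.median(y_train, axis=0), y_test.shape)

spread = prediction_spread(constant_pred)
model_mae = mae(y_test, constant_pred)
baseline = median_baseline_mae(y_train, y_test)
```

With real data, `constant_pred` would be replaced by the output of `predict`. A spread close to zero together with a model MAE close to the baseline would suggest the network has learned only the median and is ignoring its inputs.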
Notes 1) In this particular problem, the time series data is not "continuous", because each time series belongs to a particular hurricane. I have therefore built the training and test samples separately for each hurricane. The implication is that I cannot use the option stateful=True in my layers, because the model would then make no distinction between the different hurricanes (if my understanding is correct).
2) There is no image data, so no convolutional model is needed.
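The per-hurricane sample construction described in Note 1 could be sketched as follows. This is an illustration under assumptions, not the code used in the question: `make_windows` and `build_dataset` are hypothetical helpers, each track is assumed to be an array of shape (time, features), and for simplicity the inputs and outputs share the same features. Windows are cut inside one track at a time, so no sample ever spans two hurricanes.

```python
import numpy as np

def make_windows(track, n_steps_in, n_steps_out):
    """Slice one hurricane track of shape (time, features) into windows."""
    X, y = [], []
    last_start = len(track) - n_steps_in - n_steps_out
    # Tracks shorter than n_steps_in + n_steps_out yield no windows.
    for start in range(last_start + 1):
        split = start + n_steps_in
        X.append(track[start:split])
        y.append(track[split:split + n_steps_out])
    return X, y

def build_dataset(tracks, n_steps_in, n_steps_out):
    """Stack the windows of all tracks into (samples, time, features) arrays."""
    X_all, y_all = [], []
    for track in tracks:
        X, y = make_windows(track, n_steps_in, n_steps_out)
        X_all.extend(X)
        y_all.extend(y)
    n_features = tracks[0].shape[1]
    X_arr = np.array(X_all).reshape(-1, n_steps_in, n_features)
    y_arr = np.array(y_all).reshape(-1, n_steps_out, n_features)
    return X_arr, y_arr

# Hypothetical tracks of different lengths; every value equals the
# hurricane's index so the origin of each window is easy to check.
tracks = [np.full((10, 3), 0.0), np.full((7, 3), 1.0), np.full((5, 3), 2.0)]

X, y = build_dataset(tracks, n_steps_in=4, n_steps_out=2)
```

Here the 10-step track contributes 5 windows, the 7-step track contributes 2, and the 5-step track is too short to contribute any.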
A few suggestions, based on my experience:
Four LSTM layers are too many. Stick to two, three at most.
Don't use relu as the activation for LSTMs.
Do not use BatchNormalization for time series.
Other than these, I'd also suggest removing the dense layers between two LSTM layers.