
Training output drops to 0 after fixed timesteps and again retrains in LSTM model

I have the task of predicting a temperature from 4 inputs. In the given data, the temperature increases non-linearly up to a certain limit and then decreases. It looks similar to the following graph: [Figure: Temperature trend]

In order to create the LSTM model, I appended 3 data files, scaled the data between 0 and 1, and reshaped the input and output data using 200 timesteps. For the input data, it looks as follows:

num1 = len(X) // 200                      # number of complete 200-step windows
X = X[:num1 * 200].reshape(-1, 200, 4)    # X is the input data array of 4 columns
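The reshape above slices the flat series into non-overlapping 200-step windows. A minimal sketch with dummy data (the array contents and length are placeholders, not the asker's actual data) shows the resulting shape:

```python
import numpy as np

# Hypothetical stand-in for the real data: 1050 rows of 4 features.
X = np.random.rand(1050, 4)

num1 = len(X) // 200                        # 5 complete 200-step windows
X_windows = X[:num1 * 200].reshape(-1, 200, 4)

print(X_windows.shape)                      # (5, 200, 4)
```

Note that the slice drops the trailing 50 rows that do not fill a complete window.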

Then I built an LSTM model with 3 layers (20 neurons in the 1st, 10 in the 2nd, and 5 in the 3rd) plus 1 dense output layer. With all the options and callbacks, it looks as follows:

import datetime
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='loss', mode='min', verbose=1,
                           restore_best_weights=True, patience=600)

def train_model(lrate=0.3e-4):
    model = Sequential()
    model.add(LSTM(20, name='LSTM_20', input_shape=(None, 4),
                   activation='relu', return_sequences=True))
    model.add(LSTM(10, name='LSTM_10', activation='relu', return_sequences=True))
    model.add(LSTM(5, name='LSTM_5', activation='relu', return_sequences=True))
    model.add(Dense(1, activation='linear'))
    opt = tf.keras.optimizers.Adam(learning_rate=lrate)  # 'lr' is deprecated
    model.compile(loss='mse', optimizer=opt)
    logdir = "C:\\Thesis\\logs\\" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
    model.fit(X, Y, epochs=15000, batch_size=64,
              callbacks=[early_stop, tensorboard_callback],
              verbose=1, validation_split=0.1)
    return model

With the flexibility of changing the learning rate, I have tried to train the model as well as possible, but the output always comes out as follows: [Figure: Sample output from the model]

On closer inspection, the sudden drop in the output occurs at every 200 timesteps, the window length I used when reshaping the data: [Figure: Magnified model output]

I have tried changing the timesteps to other values, but this always occurs, and I am at a loss to explain the phenomenon. Does anyone have ideas/solutions for this? Thanks.

The answer is to use a stateful LSTM. It has worked for me. A stateless LSTM resets its hidden and cell states at the start of every sample, i.e. at every 200-step window boundary, which is exactly where your output drops; a stateful LSTM carries the states over from one batch to the next so the model sees the series as continuous.
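A minimal sketch of the stateful variant, assuming the same 3-layer architecture and learning rate as the question (the dummy data shapes are placeholders): each LSTM layer gets `stateful=True`, the first layer needs a fixed `batch_input_shape`, training must not shuffle the windows, and the states are reset manually once per pass over the full series.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

batch_size, timesteps, features = 1, 200, 4

model = Sequential([
    # stateful=True keeps hidden/cell states across batches, so window
    # boundaries no longer reset the LSTM to zeros.
    LSTM(20, stateful=True, return_sequences=True,
         batch_input_shape=(batch_size, timesteps, features)),
    LSTM(10, stateful=True, return_sequences=True),
    LSTM(5, stateful=True, return_sequences=True),
    Dense(1, activation='linear'),
])
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.3e-4))

# Dummy windows standing in for the real reshaped data.
X = np.random.rand(4, timesteps, features)
Y = np.random.rand(4, timesteps, 1)

# Train one epoch at a time so the carried-over state can be reset
# between passes; shuffle=False keeps the windows in time order.
for epoch in range(2):
    model.fit(X, Y, epochs=1, batch_size=batch_size, shuffle=False, verbose=0)
    model.reset_states()
```

With `stateful=True` the batch size is fixed at build time, so prediction must also use `batch_size=1` here; resetting states between independent series (e.g. your 3 appended data files) is still the caller's responsibility.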

