
How to train an LSTM neural network with multiple independent time series?

Let's say I have 3 sensors (Sa, Sb and Sc) that measure daily air temperature during January only (so during Julian days 1 to 31). Suppose I have complete datasets for 3 years, with days numbered consecutively (1 to 365: first year, 366 to 730: second year, and so on). So my dataset looks like:

     -------- ------ -------
    | Sensor | Day  | Value |
    |  Sa    |   01 |  7.2  |
    |  Sa    |   02 |  7.0  |
       ...
    |  Sa    |   31 |  5.9  |
    |  Sa    |  366 |  7.4  |
    |  Sa    |  367 |  7.5  |
       ...
    |  Sa    |  761 |  5.5  |
    |  Sb    |   01 |  6.9  |
    |  Sb    |   02 |  7.1  |
       ...
    |  Sb    |  761 |  5.6  |
    |  Sc    |   01 |  6.8  |
       ...
    |  Sc    |  761 |  4.1  |
     -------- ------ -------

I want to predict the value at time t given the values at t-3 to t-1 (so x has size 3 and y has size 1). As we can see, we have 9 continuous time series (days 1 to 31 for Sa, days 366 to 396 for Sa, ..., days 1 to 31 for Sb, ...). How should I organize my training set in this scenario, considering the batching issue described below?

So far, I have split my data into x/y 2D matrices using only the 'valid' sequences, that is:

  features_set         labels
 | x1  |  x2 |  x3 |   |   y |
 | 7.2 | 7.0 | 6.9 |   | 6.7 |   (sample 1: for Sa days 1 to 3 -> 4)
 | 7.0 | 6.9 | 6.7 |   | 6.8 |   (sample 2: for Sa days 2 to 4 -> 5)
 ...
 | 5.7 | 5.8 | 5.8 |   | 5.9 |   (sample 28: for Sa days 28 to 30 -> 31)
 | 7.4 | 7.5 | 7.4 |   | 7.3 |   (sample 29: for Sa days 366 to 368 -> 369)
 ...
 | 7.0 | 6.9 | 6.7 |   | 6.8 |   (sample 252: for Sc days 758 to 760 -> 761)

Note that samples 1 to 28 are the classical 'shifted sequences' of the first month of Sa, but there is a break in the temporal sequence between samples 28 and 29: sample 28 belongs to the first year of measurements while sample 29 belongs to the second year.
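
For reference, a minimal sketch of how I build these windows per contiguous run, assuming the table above is loaded into a pandas DataFrame df with columns Sensor, Day and Value (the names come from the table; the year index derived from Day is just a convenient grouping key):

import numpy as np
import pandas as pd

def make_windows(values, x_len=3):
    """Slide a window of x_len inputs plus 1 target over one contiguous series."""
    xs, ys = [], []
    for i in range(len(values) - x_len):
        xs.append(values[i:i + x_len])
        ys.append(values[i + x_len])
    return xs, ys

def build_dataset(df, x_len=3):
    """Window each (sensor, year) run separately so no sample crosses a break."""
    xs, ys = [], []
    # (Day - 1) // 365 is 0 for days 1-365, 1 for days 366-730, and so on,
    # so each group is one sensor's January of one year.
    year = (df["Day"] - 1) // 365
    for _, run in df.groupby(["Sensor", year]):
        x, y = make_windows(run.sort_values("Day")["Value"].to_numpy(), x_len)
        xs.extend(x)
        ys.extend(y)
    # Shapes the Keras model below expects: (samples, time steps, features).
    return (np.asarray(xs, dtype="float32").reshape(-1, x_len, 1),
            np.asarray(ys, dtype="float32").reshape(-1, 1))

features_set, labels = build_dataset(df)   # df holds the table above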

If I train the network with a batch size (N) of 32, the minimum loss I obtain is 0.5. When I reduce the batch size to 8, the loss drops to between 0.1 and 0.05. With a batch size of 1, I get 0.04 (and this seems to be the minimum obtainable).

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(4, input_shape=(features_set.shape[1], 1)))  # 3 time steps, 1 feature
model.add(Dense(1))                                         # single-value prediction

model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(features_set, labels, epochs=100, batch_size=N)

(where features_set is a 252x3x1 float array and labels is a 252x1 float array)

So does the choice of a bigger batch size cause samples like 28 and 29 to be batched together? And is this the cause of the worse training result? How can I deal with this scenario other than using a batch size of 1?

First, I would normalise the data to between 0 and 1.
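
A minimal sketch with scikit-learn's MinMaxScaler, reusing the variable names from the question (in a real setup you would fit the scaler on the training split only):

from sklearn.preprocessing import MinMaxScaler

# Fit one scaler on all temperature values, then map features and
# labels into [0, 1] with the same transform.
scaler = MinMaxScaler(feature_range=(0, 1))
flat = features_set.reshape(-1, 1)          # MinMaxScaler wants 2D input
features_set = scaler.fit_transform(flat).reshape(features_set.shape)
labels = scaler.transform(labels.reshape(-1, 1))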

See if a smaller learning rate and more epochs help.
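
For example (1e-4 and 500 epochs are just starting points to experiment with; older Keras versions call the argument lr rather than learning_rate):

from keras.optimizers import Adam

# Adam's default learning rate is 0.001; try a smaller one and
# compensate with more epochs.
model.compile(loss='mean_squared_error', optimizer=Adam(learning_rate=1e-4))
model.fit(features_set, labels, epochs=500, batch_size=N)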

Each of your samples has shape [3, 1] (3 time steps, 1 feature), so a batch of N samples has shape [N, 3, 1] and is trained against targets of shape [N, 1].

My guess is that with larger batches the reported loss is higher because the model averages the error over 32 samples instead of 1. I would keep your batch size as it is.

If it helps, this model seems similar to the one in this post: https://towardsdatascience.com/predicting-stock-price-with-lstm-13af86a74944
