
LSTM data shape. I need help changing the LSTM to read my DataFrame (or vice versa)

In a previous question, I asked how to use Sequential() to build an LSTM with the proper parameter count. The LSTM parameter count is 4 * ([outputSize * inputSize] + outputSize^2 + outputSize).

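outputSize is the number of units in the LSTM layer, i.e. the size of the vector it returns at each step (whether it returns one value or a whole sequence is controlled separately by return_sequences). inputSize is the number of features per timestep, which here is the length of an individual sample/record/observation.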

Based on this equation, I set up the following LSTM to have 28 parameters, as the equation predicts:

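# Imports assume standalone Keras; adjust if you use tensorflow.keras
from keras.models import Sequential
from keras.layers import LSTM

m3 = Sequential()
m3.add(LSTM(1, batch_input_shape=(None, 1, 5)))
m3.summary()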
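As a quick sanity check on the figure of 28, the formula can be evaluated directly. This is a minimal sketch in plain Python; the function name `lstm_param_count` is made up for illustration and is not part of Keras.

```python
def lstm_param_count(output_size, input_size):
    """Evaluate 4 * (outputSize * inputSize + outputSize^2 + outputSize)."""
    input_weights = output_size * input_size   # weights applied to the input
    recurrent_weights = output_size ** 2       # weights applied to the previous output
    biases = output_size                       # one bias per unit
    # The factor 4 covers the three gates plus the cell candidate.
    return 4 * (input_weights + recurrent_weights + biases)


toy_params = lstm_param_count(output_size=1, input_size=5)
print(toy_params)  # 28
```

With outputSize = 1 and inputSize = 5 this gives 4 * (5 + 1 + 1) = 28.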

This was a toy example. I am now building an LSTM for some time-series data that I've already used successfully with an MLP. The data is a pandas DataFrame with 9 lags per observation.

>>> X_train[:5]
          Lag09     Lag08     Lag07  ...     Lag03     Lag02     Lag01
69200  0.450732  0.467834  0.462262  ...  0.471648  0.476497  0.460177
69140  0.467834  0.462262  0.455329  ...  0.476497  0.460177  0.471678
69080  0.462262  0.455329  0.456245  ...  0.460177  0.471678  0.476364
69020  0.455329  0.456245  0.472979  ...  0.471678  0.476364  0.467761
68960  0.456245  0.472979  0.471648  ...  0.476364  0.467761  0.471914

[5 rows x 9 columns]
>>> type(X_train)
<class 'pandas.core.frame.DataFrame'>

The targets look like this:

>>> y_train[:5]
69200    0.471678
69140    0.476364
69080    0.467761
69020    0.471914
68960    0.484080
Name: Close, dtype: float64
>>> type(y_train)
<class 'pandas.core.series.Series'>

Following the parameter formula above, I built an LSTM like this:

my = Sequential()
my.add(LSTM(20, batch_input_shape=(None,1,9), return_sequences=True))
my.add(LSTM(20, return_sequences=True))
my.add(LSTM(20, return_sequences=True))
my.add(LSTM(1))

The None means I'm not specifying the number of observations (the batch size).
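Applying the same formula layer by layer shows what this stack should contain. The sketch below is plain Python, not Keras: the `layers` list and the helper name are illustrative, and it assumes each layer uses the default bias terms. It also tracks how return_sequences affects the output shape handed to the next layer.

```python
def lstm_param_count(units, features):
    # 4 * ([outputSize * inputSize] + outputSize^2 + outputSize)
    return 4 * (units * features + units ** 2 + units)


# (units, return_sequences) for each layer of the model above
layers = [(20, True), (20, True), (20, True), (1, False)]

timesteps, features = 1, 9  # from batch_input_shape=(None, 1, 9)
rows = []
for units, return_sequences in layers:
    params = lstm_param_count(units, features)
    # return_sequences=True keeps the time axis; False keeps only the last step
    out_shape = (None, timesteps, units) if return_sequences else (None, units)
    rows.append((out_shape, params))
    features = units  # the next layer sees this layer's units as its features

for out_shape, params in rows:
    print(out_shape, params)

total_params = sum(params for _, params in rows)
print(total_params)  # 9048
```

The first layer is the only one whose inputSize is 9; every later layer takes the previous layer's unit count as its inputSize.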

When I try to fit the model on this data, I get a dimension error:

my.fit(X_train, y_train,
       validation_data=(X_validation, y_validation),
       epochs=noepoch, verbose=0,
       shuffle=False)

ValueError: Error when checking input: expected lstm_input to have 3 dimensions, 
but got array with shape (1212, 9)

Is there a problem with how I use .fit()?
Why are 3 dimensions expected? (A similar error occurs if I remove None, or reverse 1,9.)
Is it an issue with the DataFrame? (DataFrames don't have a .reshape() method.)

Solved it.

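  1. The data shouldn't be a DataFrame; convert it to a NumPy array first

     import numpy as np
     data_as_array = np.array(dataframe_of_9_columns)

  2. The array then needs the 3D shape the LSTM expects: (samples, timesteps, features)

     data_shape_array = data_as_array.reshape(len(data_as_array), 1, 9)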

This becomes the new X_train. I suppose y_train is fine, as the model did run.

lstm_model.fit(data_shape_array, y_train)

I will try this in my full code and see what happens.
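The two steps above can be sketched end to end on stand-in data. This is only an illustration: the DataFrame below is filled with arbitrary numbers rather than the real series, and `X_2d` / `X_3d` are made-up names. Passing -1 as the first dimension lets NumPy infer the number of rows.

```python
import numpy as np
import pandas as pd

# Stand-in data: 6 rows of 9 lag columns (arbitrary values, not the real series)
columns = [f"Lag{i:02d}" for i in range(9, 0, -1)]
X_train = pd.DataFrame(np.arange(54, dtype=float).reshape(6, 9), columns=columns)

X_2d = X_train.to_numpy()      # shape (6, 9): samples x features
X_3d = X_2d.reshape(-1, 1, 9)  # shape (6, 1, 9): samples x timesteps x features

print(X_2d.shape, X_3d.shape)
```

Any validation inputs passed through validation_data would presumably need the same reshape, since the model checks them against the same input shape. Treating the 9 lags as 9 timesteps of 1 feature, i.e. reshape(-1, 9, 1), is another option, but the model's batch_input_shape would have to change to (None, 9, 1) to match.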
