
Why do I get a Keras LSTM RNN input_shape error?

I keep getting an input_shape error from the following code.

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
import numpy as np

def _load_data(data):
    """
    data should be pd.DataFrame()
    """
    n_prev = 10
    docX, docY = [], []
    for i in range(len(data)-n_prev):
        docX.append(data.iloc[i:i+n_prev].as_matrix())
        docY.append(data.iloc[i+n_prev].as_matrix())
    if not docX:
        pass
    else:
        alsX = np.array(docX)
        alsY = np.array(docY)
        return alsX, alsY

X, y = _load_data(dframe)
poi = int(len(X) * .8)
X_train = X[:poi]
X_test = X[poi:]
y_train = y[:poi]
y_test = y[poi:]

input_dim = 3

All of the above runs smoothly. This is where it goes wrong.

in_out_neurons = 2
hidden_neurons = 300
model = Sequential()
#model.add(Masking(mask_value=0, input_shape=(input_dim,)))
model.add(LSTM(in_out_neurons, hidden_neurons, return_sequences=False, input_shape=(len(full_data),)))
model.add(Dense(hidden_neurons, in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train, y_train, nb_epoch=10, validation_split=0.05)

It returns this error.

Exception: Invalid input shape - Layer expects input ndim=3, was provided with input shape (None, 10320)

When I check the Keras documentation, it says to specify a tuple "(eg (100,) for 100-dimensional inputs)."

That being said, my data set consists of one column with a length of 10320. I assume that means I should pass (10320,) as the input_shape, but I still get the error. Does anyone have a solution?

My understanding is that both your input and your output are one-dimensional vectors. The trick is to reshape them to meet Keras's requirements:

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
import numpy as np

X= np.random.rand(1000)
y = 2*X

poi = int(len(X) * .8)
X_train = X[:poi]
y_train = y[:poi]

X_test = X[poi:]
y_test = y[poi:]

# you have to change your input shape (nb_samples, timesteps, input_dim)
X_train = X_train.reshape(len(X_train), 1, 1)
# and also the output shape (note that the output *shape* is 2 dimensional)
y_train = y_train.reshape(len(y_train), 1)


#in_out_neurons = 2 
in_out_neurons = 1

hidden_neurons = 300
model = Sequential()
#model.add(Masking(mask_value=0, input_shape=(input_dim,)))
model.add(LSTM(hidden_neurons, return_sequences=False, batch_input_shape=X_train.shape))
# only specify the output dimension
model.add(Dense(in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train, y_train, nb_epoch=10, validation_split=0.05)

# calculate test set MSE
preds = model.predict(X_test).reshape(len(y_test))
MSE = np.mean((preds-y_test)**2)

Here are the key points:

  • When you add your first layer, you are required to specify the number of hidden nodes and your input shape. Subsequent layers don't require the input shape, as they can infer it from the hidden nodes of the previous layer (see the sketch after this list)
  • Similarly, for your output layer you only specify the number of output nodes
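For example, here is a minimal sketch of that point, reusing the hidden_neurons and in_out_neurons names from above and assuming one timestep with one feature:

from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.recurrent import LSTM

hidden_neurons = 300
in_out_neurons = 1

model = Sequential()
# first layer: number of hidden units plus the input shape (timesteps, input_dim)
model.add(LSTM(hidden_neurons, input_shape=(1, 1)))
# subsequent layers only declare their own output size; the input size is inferred
model.add(Dense(in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")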

Hope this helps.

Some more information: when using an RNN (like an LSTM) with sequences of variable length, you have to take care of the format of your data.

When you group sequences in order to pass them to the fit method, Keras will try to build a matrix of samples, which implies that all input sequences must have the same size; otherwise you won't have a matrix of the correct dimensions, as illustrated below.
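As a rough illustration with made-up numbers, this is why the batch has to be rectangular:

import numpy as np

# three sequences of the same length stack into a regular 2-D array
same_len = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(np.array(same_len).shape)  # (3, 3) -> (nb_samples, timesteps)

# sequences of different lengths cannot be stacked into one rectangular array;
# depending on the NumPy version this gives an object array or an error, and
# either way it is not the (nb_samples, timesteps, input_dim) tensor Keras needs
ragged = [[1, 2, 3], [4, 5], [6]]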

There are several possible solutions:

  1. train your network on samples one by one (using fit_generator, for example)
  2. pad all your sequences so they have the same size
  3. group sequences by size (padding them if needed) and train your network group by group (again using a generator-based fit)

The third solution corresponds to the most common strategy for variable-length sequences. And if you pad sequences (second or third solution), you may want to add a Masking layer as the input layer.

If you're not sure, try to print the shape of your data (using the shape attribute of the NumPy array).

You may need to look at: https://keras.io/preprocessing/sequence/ (pad_sequences) and https://keras.io/layers/core/#masking
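For example, here is a rough sketch of solutions 2 and 3 combined with a Masking layer, using made-up toy sequences (one feature per timestep, padding value 0) and Keras 2 argument names:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Masking
from keras.layers.recurrent import LSTM
from keras.preprocessing.sequence import pad_sequences

# toy variable-length sequences, one feature per timestep
sequences = [[1, 2, 3, 4], [5, 6], [7, 8, 9]]

# pad to the length of the longest sequence -> shape (nb_samples, maxlen)
padded = pad_sequences(sequences, padding='post', value=0, dtype='float32')
# reshape to (nb_samples, timesteps, input_dim) for the LSTM
X = padded.reshape(padded.shape[0], padded.shape[1], 1)
y = np.array([1.0, 0.0, 1.0])  # made-up targets

model = Sequential()
# the Masking layer tells the LSTM to skip the padded timesteps;
# the mask value must not occur as a real value in the sequences
model.add(Masking(mask_value=0, input_shape=(X.shape[1], X.shape[2])))
model.add(LSTM(32, return_sequences=False))
model.add(Dense(1))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X, y, epochs=2)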

Following is a working version for Keras 2.0.0, modified from radix's code above.

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
import numpy as np

X= np.random.rand(1000)
y = 2 * X

poi = int(len(X) * .8)
X_train = X[:poi]
y_train = y[:poi]

X_test = X[poi:]
y_test = y[poi:]

# you have to change your input shape (nb_samples, timesteps, input_dim)
X_train = X_train.reshape(len(X_train), 1, 1)
# and also the output shape (note that the output *shape* is 2 dimensional)
y_train = y_train.reshape(len(y_train), 1)

# Change test data's dimension also.
X_test = X_test.reshape(len(X_test),1,1)
y_test = y_test.reshape(len(y_test),1)


#in_out_neurons = 2
in_out_neurons = 1

hidden_neurons = 300
model = Sequential()
# model.add(Masking(mask_value=0, input_shape=(input_dim,)))
# Remove batch_input_shape and add input_shape=(1, 1) - important change for Keras 2.0.0
model.add(LSTM(hidden_neurons, return_sequences=False, input_shape=(X_train.shape[1],X_train.shape[2])))
# only specify the output dimension
model.add(Dense(in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.summary()
model.fit(X_train, y_train, epochs=10, validation_split=0.05)

# calculate test set MSE
preds = model.predict(X_test).reshape(len(y_test))
print(preds)
MSE = np.mean((preds-y_test)**2)
print('MSE ', MSE)

Try using the LSTM layer without specifying the input shape and let Keras do the work for you. I think you commented out the Masking layer because you were getting a similar issue. I faced it before, and it turns out that input_shape = (time_steps, input_dim). I think this happens due to the new automatic shape inference in Keras.
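For reference, a minimal sketch (with hypothetical layer sizes) of the two ways of declaring the shape that come up in the answers above:

from keras.models import Sequential
from keras.layers.recurrent import LSTM

# Keras 2 style: give only the per-sample shape (time_steps, input_dim);
# the batch size stays flexible
model = Sequential()
model.add(LSTM(300, input_shape=(1, 1)))

# batch_input_shape additionally fixes the batch size,
# i.e. (batch_size, time_steps, input_dim); it is mainly needed for stateful LSTMs
stateful_model = Sequential()
stateful_model.add(LSTM(300, batch_input_shape=(32, 1, 1), stateful=True))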
