
Preparing Pandas DataFrame for LSTM


I'm trying to fit an LSTM classifier using Keras but don't understand how to prepare the data for training.

I currently have two DataFrames for the training data. X_train contains 48 hand-crafted temporal features from IMU data, and y_train contains the corresponding labels (4 kinds) representing terrain. Their shapes, after reshaping X_train, are given below:

X_train = X_train.values.reshape(X_train.shape[0],X_train.shape[1],1)
print(X_train.shape, y_train.shape)
(268320, 48, 1) (268320,)

Model using batch_size = (32,5,48) :

def def_model():
    model = Sequential()
    model.add(LSTM(units=144,batch_size=(32, 5, 48),return_sequences=True))
    model.add(Dropout(0.5)) 
    model.add(Dense(144, activation='relu'))
    model.add(Dropout(0.5))         
    model.add(Dense(4, activation='softmax'))          
    model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['categorical_accuracy'])        
    return model

model_LSTM = def_model()

LSTM_history = model_LSTM.fit(X_train, y_train, epochs=15, validation_data=(X_valid, y_valid), verbose=1)

The error that I am getting:
ValueError: Shapes (32, 1) and (32, 48, 4) are incompatible

Any insight into how to fix this particular error and any intuition into what Keras is expecting?

What is the 5 in your batch size? An LSTM layer expects its input in the form (batch_size, time_steps, features_per_time_step). If I am understanding correctly, your data has time_steps = 1 and features_per_time_step = 48, so each batch should have shape (32, 1, 48) rather than (32, 48, 1).
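As a rough sketch of what that layout means for the arrays themselves, the snippet below uses small random NumPy arrays as stand-ins for X_train.values and y_train.values. The sizes, the names X_flat, y_int, X_seq, X_question and y_onehot, and the assumption that the labels are integers 0 to 3 are all illustrative and not taken from the question.

```python
import numpy as np

# Hypothetical stand-ins for X_train.values and y_train.values (tiny sizes for illustration)
n_samples, n_features, n_classes = 8, 48, 4
rng = np.random.default_rng(0)
X_flat = rng.random((n_samples, n_features))        # like X_train.values: (n, 48)
y_int = rng.integers(0, n_classes, size=n_samples)  # assumed integer labels 0..3: (n,)

# One time step carrying all 48 features: (n, 1, 48)
X_seq = X_flat.reshape(n_samples, 1, n_features)

# The reshape used in the question: 48 time steps of one feature each: (n, 48, 1)
X_question = X_flat.reshape(n_samples, n_features, 1)

# categorical_crossentropy expects one-hot labels: (n, 4)
y_onehot = np.eye(n_classes)[y_int]

print(X_seq.shape, X_question.shape, y_onehot.shape)
# (8, 1, 48) (8, 48, 1) (8, 4)
```

If the labels are strings rather than integers, they would need to be mapped to integer codes first.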

Here is a working sample, including the shape of each array.

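import numpy as np
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

def def_model():
    model = Sequential()
    # Each sample is 1 time step of 48 features; the batch dimension is not part of the shape
    model.add(Input(shape=(1, 48)))
    model.add(LSTM(units=144, return_sequences=True))
    model.add(Dropout(0.5))
    model.add(Dense(144, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4, activation='softmax'))
    model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
    return model

model_LSTM = def_model()

# Random stand-in data: 10000 samples, 1 time step, 48 features
X_train = np.random.random((10000, 1, 48))
y_train = np.random.random((10000, 4))
# With return_sequences=True the output is (batch, 1, 4), so the labels need a time axis too
y_train = y_train.reshape(-1, 1, 4)

# Batch the arrays into groups of 32
data = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(32)
model_LSTM.fit(data, epochs=15, verbose=1)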

Passing data to fit, instead of X_train and y_train separately, will fit the model properly, because the labels now have shape (n_samples, 1, 4) and match the model's output.
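To see where the shapes in the error message come from, the shape rule can be traced by hand. The function below is a hypothetical helper, not part of Keras. It assumes the stack from the sample above (LSTM, then Dense layers that act only on the last axis) and simply reports the output shape.

```python
def lstm_dense_output_shape(input_shape, n_classes, return_sequences):
    """Hypothetical helper: output shape of LSTM -> ... -> Dense(n_classes).

    input_shape is (batch, time_steps, features). Dropout keeps the shape,
    and Dense only changes the size of the last axis.
    """
    batch, time_steps, _ = input_shape
    if return_sequences:
        return (batch, time_steps, n_classes)  # one prediction per time step
    return (batch, n_classes)                  # one prediction per sequence

# The question: X reshaped to (n, 48, 1) with return_sequences=True
print(lstm_dense_output_shape((32, 48, 1), 4, True))   # (32, 48, 4)
# The sample above: X of shape (n, 1, 48) with return_sequences=True
print(lstm_dense_output_shape((32, 1, 48), 4, True))   # (32, 1, 4)
# Same input with return_sequences=False
print(lstm_dense_output_shape((32, 1, 48), 4, False))  # (32, 4)
```

The first case gives the (32, 48, 4) from the error, which cannot be compared against labels of shape (32, 1). With return_sequences=False the output would be (32, 4), so one-hot labels of shape (n_samples, 4) would likely work without the extra time axis.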

If you want 5 time steps per sample, you will have to build X_train so that it has shape (n_samples, 5, 48).
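One possible way to build such an array is a sliding window over consecutive rows. This is only a sketch under assumptions the question does not confirm: that the rows of X_train are in time order, that windows may overlap, and that each window takes the label of its last row. make_windows is a hypothetical helper name.

```python
import numpy as np

def make_windows(X, y, time_steps=5):
    """Stack overlapping windows of consecutive rows into (n_windows, time_steps, n_features)."""
    n_windows = X.shape[0] - time_steps + 1
    # Row i of idx holds the row numbers i, i+1, ..., i+time_steps-1
    idx = np.arange(n_windows)[:, None] + np.arange(time_steps)[None, :]
    X_win = X[idx]
    # Assumption: each window is labelled by its last row
    y_win = y[time_steps - 1:]
    return X_win, y_win

# Toy data: 10 rows of 48 features, integer labels 0..3
X = np.arange(10 * 48, dtype=float).reshape(10, 48)
y = np.arange(10) % 4

X_win, y_win = make_windows(X, y, time_steps=5)
print(X_win.shape, y_win.shape)  # (6, 5, 48) (6,)
```

If the data comes from several separate recordings, the windows would need to be built per recording so that no window spans two of them.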

