
How to input classification time series data into an LSTM

I want to feed my data into an LSTM network, but I can't find a similar question or tutorial. My dataset looks like this:

person 1:
    t1 f1 f2 f3
    t2 f1 f2 f3
     ...
    tn f1 f2 f3
.
.
.

person K:
    t1 f1 f2 f3
    t2 f1 f2 f3
     ...
    tn f1 f2 f3

So I have K persons, and for each person I have a matrix-like input. The first column of each row is an increasing timestamp (like a timeline, so t1 < t2), and the other columns are the person's features at that time.

Mathematically, I have a (number of examples, number of timestamps, number of features) array such as (52, 20, 4), where 52 is the number of persons, 20 is the number of timestamps per person, and 4 is the number of columns (1 timestamp column and 3 feature columns).

Each person has a class label. I want to classify these persons into two classes using an LSTM neural network. My question is: how do I feed this type of data into an LSTM in a high-level library such as Keras?

Edit: My first attempt was to use the following as input_shape in Keras, but I only get 50% accuracy in binary classification. Is the problem in my dataset, or is the input_shape wrong?
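As a rough illustration of that layout, the sketch below builds a (52, 20, 4) NumPy array from per-person matrices. The data is random placeholder data and the variable names (persons, X, y) are assumptions made for this example, not part of the original dataset.

```python
import numpy as np

# Hypothetical stand-in data: K persons, each an (n, 4) matrix whose
# columns are [timestamp, f1, f2, f3], as described in the question.
K, n = 52, 20
rng = np.random.default_rng(0)

persons = []
for _ in range(K):
    t = np.arange(1, n + 1, dtype=np.float32)           # t1 < t2 < ... < tn
    feats = rng.normal(size=(n, 3)).astype(np.float32)  # placeholder f1..f3
    persons.append(np.column_stack([t, feats]))         # shape (n, 4)

# Stack the per-person matrices into one (samples, timesteps, features) array.
X = np.stack(persons, axis=0)

# One binary class label per person (random placeholders here).
y = rng.integers(0, 2, size=K)

print(X.shape, y.shape)
```

Stacking this way only works when every person has the same number of timestamps; sequences of different lengths would need padding or truncation first.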

LSTM(5,input_shape=(20,4))

You need to represent each person's data as a feature vector and pass this vector to a classifier (e.g. an MLP classifier). I guess your question is really how to get that feature vector out of the raw data. There are many ways to extract features from time-series data; in your case, an LSTM is one option.

An LSTM needs a 3D tensor as input, with the shape [batch_size x time x feature]. As you mentioned in the question, you can feed the data into the model with:

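from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(5, input_shape=(20, 4)))  # 20 time steps, 4 columns per step
model.add(Dense(2, activation='sigmoid'))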

1) I guess the t and f values vary widely and are not normalized. As a result, the LSTM's predictions are poor.

2) Your dataset is relatively small. To locate the issue, try to overfit the model on a small subset of the training data. If you reach 100% accuracy on that subset, the LSTM has learned to represent the feature vectors well. Otherwise, either the model is poorly designed or the data is not being fed in properly.
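The two points above can be sketched together. This is only a minimal illustration under stated assumptions, not the answerer's code: the helper names (standardize, small_subset, overfit_check), the placeholder data, the 40/12 split, and the single-unit sigmoid output with binary cross-entropy are all choices made for this example. It also assumes TensorFlow's bundled Keras is installed for the training step.

```python
import numpy as np

def standardize(X_train, X_other):
    """Z-score each feature using statistics from the training set only."""
    mean = X_train.mean(axis=(0, 1), keepdims=True)  # pooled over persons and time
    std = X_train.std(axis=(0, 1), keepdims=True)
    std = np.where(std == 0, 1.0, std)               # guard against constant features
    return (X_train - mean) / std, (X_other - mean) / std

def small_subset(X, y, per_class=4):
    """Take the first `per_class` examples of each of the two classes."""
    idx = np.concatenate([np.flatnonzero(y == c)[:per_class] for c in (0, 1)])
    return X[idx], y[idx]

def overfit_check(X, y, epochs=300):
    """Train on a tiny subset and return the final training accuracy."""
    # Imported here so the data helpers above work without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=X.shape[1:]),               # (timesteps, features)
        keras.layers.LSTM(5),
        keras.layers.Dense(1, activation="sigmoid"),  # assumed binary output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(X, y, epochs=epochs, batch_size=len(X), verbose=0)
    return history.history["accuracy"][-1]

# Placeholder data with very different scales per column.
rng = np.random.default_rng(0)
X = rng.normal(loc=[10.0, 0.0, 500.0, -3.0],
               scale=[5.0, 1.0, 100.0, 0.1], size=(52, 20, 4))
y = np.arange(52) % 2

X_train, X_test = standardize(X[:40], X[40:])
X_small, y_small = small_subset(X_train, y[:40])
print(X_small.shape, y_small)

# Uncomment to run the check; accuracy near 1.0 suggests the pipeline is sound.
# print(overfit_check(X_small, y_small))
```

If the training accuracy on such a tiny subset stays near 50%, the problem is more likely in how the data is prepared or labeled than in the size of the network.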

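According to the Keras documentation for LSTMs, the layer expects a 3D input whose first dimension is the batch size (usually None). Note, however, that input_shape describes a single sample and leaves out the batch dimension, so input_shape=(20, 4) already corresponds to a full input shape of (None, 20, 4). This is a common source of confusion with Keras.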
