
Saving LSTM hidden states while training and predicting for multi-class time series classification

I am trying to use an LSTM for multi-class classification of time series data.

The training set has dimensions (390, 179), i.e. 390 objects with 179 time steps each.

There are 37 possible classes.

I would like to use a Keras model with just an LSTM and activation layer to classify input data.

I also need the hidden states for all the training data and test data passed through the model, at every step of the LSTM (not just the final state).

I know return_sequences=True is needed, but I'm having trouble getting dimensions to match.

Below is some code I've tried, but I've also tried a ton of other combinations of calls gathered from a motley of StackExchange posts and GitHub issues. In all of them I get one dimension mismatch or another.

I don't know how to extract the hidden state representations from the model.

We have X_train.shape = (390, 1, 179) and Y_train.shape = (390, 37) (one-hot binary vectors).

from keras.models import Model
from keras.layers import Input, LSTM, Dense

n_units = 8
n_sequence = 179
n_class = 37

x = Input(shape=(1, n_sequence))
y = LSTM(n_units, return_sequences=True)(x)
z = Dense(n_class, activation='softmax')(y)

model = Model(inputs=[x], outputs=[z])
model.compile(loss='categorical_crossentropy', optimizer='adam')

model.fit(X_train, Y_train, epochs=100, batch_size=128)
Y_test_predict = model.predict(X_test, batch_size=128)

This is what the above gives me:

ValueError: A target array with shape (390, 37) was passed for an output of shape (None, 1, 37) while using as loss 'categorical_crossentropy'. This loss expects targets to have the same shape as the output.

Your input shape should look like this: (samples, timesteps, features), where samples is how many sequences you have, timesteps is how long each sequence is, and features is how many inputs you feed in at each time step. If you set return_sequences=True, your label array should have the shape (samples, timesteps, output features).
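To make those shapes concrete, here is a minimal runnable sketch (with made-up random data and hypothetical variable names, not the asker's actual dataset): the 179 values per object become 179 timesteps of 1 feature each, and with return_sequences left off the model's output shape (samples, n_class) matches the (390, 37) label array directly.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_samples, n_timesteps, n_features = 390, 179, 1
n_units, n_class = 8, 37

# Reshape to (samples, timesteps, features) = (390, 179, 1).
X = np.random.rand(n_samples, n_timesteps, n_features).astype("float32")
Y = np.eye(n_class)[np.random.randint(0, n_class, n_samples)]  # one-hot labels

x = Input(shape=(n_timesteps, n_features))
h = LSTM(n_units)(x)                       # return_sequences=False: final state only
z = Dense(n_class, activation="softmax")(h)

model = Model(inputs=x, outputs=z)
model.compile(loss="categorical_crossentropy", optimizer="adam")

print(model.output_shape)  # (None, 37) -- matches Y's (390, 37), so fit() works
```

With return_sequences=True instead, the Dense head would emit (samples, 179, 37) and you would need per-timestep labels, which is exactly the mismatch in the error above.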

There didn't seem to be any way to build a working trainable model while also returning the hidden states with return_sequences=True .

The fix I found was to build a predictor model, train it, and save the weights. Then I built a new model which ended with my LSTM layer, and fed it the trained weights. So, using return_sequences=True , I was able to predict on new data and get the data's representations at each hidden state.
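That weight-transfer trick might be sketched like this (hypothetical layer names and toy dimensions; the fit() call is commented out since no real data is attached here). An LSTM's weight tensors have the same shapes whether or not it returns sequences, so the trained weights can be copied straight into a second model that ends at the LSTM with return_sequences=True:

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_timesteps, n_features, n_units, n_class = 179, 1, 8, 37

# 1) Trainable classifier: LSTM returns only its final state.
x = Input(shape=(n_timesteps, n_features))
h = LSTM(n_units, name="lstm")(x)
z = Dense(n_class, activation="softmax")(h)
clf = Model(x, z)
clf.compile(loss="categorical_crossentropy", optimizer="adam")
# clf.fit(X_train, Y_train, epochs=100, batch_size=128)  # train as usual

# 2) Extractor: identical up to the LSTM, but returning every hidden state,
#    with the trained weights copied in by layer name.
x2 = Input(shape=(n_timesteps, n_features))
h2 = LSTM(n_units, return_sequences=True, name="lstm")(x2)
extractor = Model(x2, h2)
extractor.get_layer("lstm").set_weights(clf.get_layer("lstm").get_weights())

# One hidden-state vector per time step for each input sequence.
states = extractor.predict(np.zeros((2, n_timesteps, n_features)), verbose=0)
print(states.shape)  # (2, 179, 8)
```

Saving with clf.save_weights() and loading by name into the extractor would work equally well across sessions.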

