RNN-LSTM accuracy not improving when training

I am trying to use a Keras RNN-LSTM model to classify 9 different sitting positions from sensor data (X/Y/Z accelerometer data, X/Y/Z gyroscope data, Euler angles, etc.).

The training data were recorded by placing 3 sensors on the backs of 8 test subjects and having them sit in the 9 different sitting positions. The first 3 test subjects sat for 2 minutes in each position and the last 5 sat for 3 minutes in each position.

The test subjects recorded the following amounts of data (in rows):

  • Subject 1: 52824
  • Subject 2: 51490
  • Subject 3: 52019
  • Subject 4: 80613
  • Subject 5: 70143
  • Subject 6: 79231
  • Subject 7: 16027
  • Subject 8: 15780

All the recordings are stored as CSV files, with timestamps (the first column in each CSV file) starting at 0.0000 and running up to the duration of the trial.

The transitions between phases (where the test subjects change sitting position) are cut out of the dataset, so there are some small "holes" in the time series.

Combining all these rows, the x_train set consists of 418127 rows and 39 columns.

np.shape(x_train)  # (418127, 39)

My y_train set is a one-dimensional array containing the correct label for the row in x_train with the same index.

np.shape(y_train)  # (418127,)
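For context, the assembly step might look roughly like this (a minimal sketch, assuming one CSV per subject with the timestamp first and the label last; the file names and column layout are hypothetical, not the author's):

import numpy as np
import pandas as pd

# Hypothetical layout: subject1.csv ... subject8.csv, where the first
# column is the timestamp and the last column is the sitting-position label.
frames = [pd.read_csv(f"subject{i}.csv") for i in range(1, 9)]
data = pd.concat(frames, ignore_index=True)

x_train = data.iloc[:, 1:-1].to_numpy()  # drop the timestamp and label columns
y_train = data.iloc[:, -1].to_numpy()    # one label per row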

As an RNN-LSTM model requires a 3D array as input, I wrote a method for creating a 3D array from my x_train set.


def create_3d_array(array, num_timestamps):
    arr_3d = []
    temp_2d = []
    for i in range(len(array)):
        temp_2d.append(array[i])  # collect rows for the current window
        if (i + 1) % num_timestamps == 0:
            # a full window of num_timestamps rows is complete
            arr_3d.append(temp_2d)
            temp_2d = []
    print("x_train: ", np.shape(arr_3d))

    return arr_3d
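Called with a sequence length of 20 and converted to an array, this produces the shapes shown below:

x_train_3d = np.array(create_3d_array(x_train, 20))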

The shape of the resulting 3D array depends on how long the sequence (number of timestamps) is supposed to be. With 20 timestamps per window, for example, 418127 rows yield floor(418127 / 20) = 20906 complete windows.

Question 1: in this particular example, what is a fitting sequence length for the 3D array given as input to the RNN-LSTM model? Is the model supposed to have long sequences, or can an RNN-LSTM model work just as well on short sequences?

I also one-hot encoded y_train in order to be able to do multiclass classification.

np.shape(x_train_3d)  # (20906, 20, 39)
np.shape(y_train_3d)  # (20906, 9)
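For reference, one way to obtain these label shapes is to keep one label per window and one-hot encode it with Keras. This is a sketch, not necessarily the author's exact method; taking the last row's label assumes the position does not change within a 20-row window:

from tensorflow.keras.utils import to_categorical

# One label per 20-row window: take the label of the window's last row
# (assumes integer labels 0-8 that are constant within each window).
y_windows = y_train[19::20][:len(x_train_3d)]
y_train_3d = to_categorical(y_windows, num_classes=9)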

I previously tried a sequence length of 20 (20 timestamps per sample) and the model below. I want to keep it as simple as possible, and therefore implement the model as a vanilla RNN-LSTM.

import tensorflow as tf
from tensorflow.keras import Sequential, layers
from tensorflow.keras.layers import Dense

# x_train/y_train here are the windowed 3D array and the one-hot labels
model = Sequential()
model.add(layers.LSTM(2, activation='relu', input_shape=[x_train.shape[1], x_train.shape[2]]))
model.add(Dense(9, activation='softmax'))
model.compile(optimizer=OPTIM, loss=tf.keras.losses.CategoricalCrossentropy(), metrics=['accuracy'])  # OPTIM is defined elsewhere
model.fit(x=x_train, y=y_train, batch_size=5, epochs=10, shuffle=False)



Epoch 1/10
4182/4182 [==============================] - 28s 7ms/step - loss: 4.9613 - accuracy: 0.7435
Epoch 2/10
4182/4182 [==============================] - 19s 5ms/step - loss: 4.7745 - accuracy: 0.7447
Epoch 3/10
4182/4182 [==============================] - 52s 12ms/step - loss: 4.7731 - accuracy: 0.7447
Epoch 4/10
4182/4182 [==============================] - 47s 11ms/step - loss: 4.7731 - accuracy: 0.7447
Epoch 5/10
4182/4182 [==============================] - 61s 15ms/step - loss: 4.7731 - accuracy: 0.7447
Epoch 6/10
4182/4182 [==============================] - 69s 16ms/step - loss: 4.7731 - accuracy: 0.7447
Epoch 7/10
4182/4182 [==============================] - 70s 17ms/step - loss: 4.7731 - accuracy: 0.7447
Epoch 8/10
4182/4182 [==============================] - 62s 15ms/step - loss: 4.7731 - accuracy: 0.7447
Epoch 9/10
4182/4182 [==============================] - 52s 12ms/step - loss: 4.7731 - accuracy: 0.7447
Epoch 10/10
4182/4182 [==============================] - 58s 14ms/step - loss: 4.7731 - accuracy: 0.7447

The test data are produced the same way, but with fewer samples, and with each position recorded for about 20 seconds.

np.shape(x_test)  # (8367, 39)
np.shape(y_test)  # (8367,)

The same methods are applied to the test data as well, resulting in these shapes:


np.shape(x_test)  # (418, 20, 39)
np.shape(y_test)  # (418, 9)

Model evaluation:

model.evaluate(x_test,y_test, batch_size=5)

418/418 [==============================] - 2s 6ms/step - loss: 14.3449 - accuracy: 0.1172
[14.344941139221191, 0.11722487956285477]

Question 2: why is the accuracy not improving, and why does the evaluation give such bad results?

I'll try to give you some advice, since in the past I also found myself with a model that did not improve. First of all, if the model stays stationary during training and then does not give decent results at test time, you should immediately suspect overfitting. Since you only have a Dense(9) on top, I would try to increase the number and size of the LSTM layers, perhaps adding another one with more cells. I also see that you are not setting the learning rate; why don't you try a scheduled learning rate? It has given me great results. Also train in batches in order to avoid overfitting.
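As an illustration of a scheduled learning rate in Keras (a sketch, not the answerer's exact setup; the initial rate and decay values are arbitrary):

import tensorflow as tf

# Exponential decay: start at 1e-3, multiply the rate by 0.9 every 1000 steps
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=1000,
    decay_rate=0.9)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy'])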
