LSTM Algorithm Produces Same Results for all Inputs

So, I am currently working on a machine learning problem pertaining to car speeds and angles, and I'm trying to improve upon some of my work. I recently got done with an XGBRegressor that yielded between 88-95% accuracy on my cross-validated data. However, I'm trying to improve upon it, so I've been looking into the LSTM algorithm, because my data is time-series dependent. Essentially, every row includes a steering angle, the previous time step's steering angle (x-1), the one before that (x-2), and the difference between the current value and the previous value (x - (x-1)). The goal is to predict whether or not a value is 'abnormal.' For instance, if an angle jumps from .1 to .5 (on a scale of 0-1), this is abnormal. My previous algorithm did a great job at classifying whether or not the angles were abnormal. Unfortunately, my algorithm is now predicting the same value for every single input. For instance, this is what it gives me.

test_X = array([[[ 5.86925570e-01,  5.86426251e-01,  5.85832947e-01,
          3.19300000e+03, -5.93304274e-04, -1.09262314e-03]],

       [[ 5.86426251e-01,  5.85832947e-01,  5.85263908e-01,
          3.19400000e+03, -5.69038950e-04, -1.16234322e-03]],

       [[ 5.85832947e-01,  5.85263908e-01,  5.84801158e-01,
          3.19500000e+03, -4.62749993e-04, -1.03178894e-03]],

       ...,

       [[ 4.58070203e-01,  4.57902738e-01,  4.64613980e-01,
          6.38100000e+03,  6.71124195e-03,  6.54377704e-03]],

       [[ 4.57902738e-01,  4.64613980e-01,  7.31314846e-01,
          6.38200000e+03,  2.66700866e-01,  2.73412108e-01]],

       [[ 4.64613980e-01,  7.31314846e-01,  4.68819741e-01,
          6.38300000e+03, -2.62495104e-01,  4.20576175e-03]]])

test_y = array([0, 0, 0, ..., 0, 1, 0], dtype=int64)

yhat = array([[-0.00068355],
       [-0.00068355],
       [-0.00068355],
       ...,
       [-0.00068355],
       [-0.00068355],
       [-0.00068355]], dtype=float32)

I've tried changing the epochs and batch sizes per some of the things I've read online so far. Furthermore, I've also tried plotting out some of the features to see if for some reason the algorithm simply doesn't like them, but I can't find anything. I'm not new to machine learning, but I am new to deep learning, so sorry if this is a stupid issue or question. Below is the code.

# imports inferred from the code below (Keras may also be tensorflow.keras depending on the setup)
import pandas as pd
from pandas import concat
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, LSTM

data = pd.read_csv('final_angles.csv')
data.dropna(axis=0, subset=['steering_angle'], inplace=True)

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
data['steering_angle'] = scaler.fit_transform(data[['steering_angle']])

y = data.flag #Set y to the value we want to predict, the 'flag' value. 
X = data.drop(['flag', 'frame_id'], axis=1) 

X = concat([X.shift(2), X.shift(1), X], axis=1)
X.columns = ['angle-2', 'id2', 'angle-1', 'id1', 'steering_angle', 'id'] 
X = X.drop(['id2', 'id1'], axis=1)  

X['diff'] = 0
X['diff2'] = 0
for index, row in X.iterrows():
    if index <= 1:
        continue
    # difference from the previous and the second-previous steering angle
    X.loc[index, "diff"] = row['steering_angle'] - X['steering_angle'][index-1]
    X.loc[index, "diff2"] = row['steering_angle'] - X['steering_angle'][index-2]

X = X.iloc[2:]  # drop the first two rows, which have no lagged values
y = y.iloc[2:]

train_X, test_X, train_y, test_y = train_test_split(X.values, y.values, test_size=0.5, shuffle=False)
# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

model = Sequential()
model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
# fit network
history = model.fit(train_X, train_y, epochs=50, batch_size=150, validation_data=(test_X, test_y), verbose=2, shuffle=False)

yhat = model.predict(test_X)

Instead of the predicted values being

array([[-0.00068355],
       [-0.00068355],
       [-0.00068355],
       ...,
       [-0.00068355],
       [-0.00068355],
       [-0.00068355]], dtype=float32)

I was expecting something more along the lines of

array([-0.00065207, -0.00065207, -0.00065207,  1.0082773 ,  0.01269123,
        0.01873571, -0.00065207, -0.00065207,  0.99916965,  0.002684  ,
       -0.00018287, -0.00065207, -0.00065207, -0.00065207, -0.00065207,
        1.0021645 ,  0.00654274,  0.01044858, -0.0002622 , -0.0002622 ],
      dtype=float32)

which came from the aforementioned XGBRegressor test.

Any help is appreciated; please let me know if more code/info is needed.

Edit: Results of Print Statement

Train on 3190 samples, validate on 3191 samples
Epoch 1/50
 - 5s - loss: 0.4268 - val_loss: 0.2820
Epoch 2/50
 - 0s - loss: 0.2053 - val_loss: 0.1256
Epoch 3/50
 - 0s - loss: 0.1442 - val_loss: 0.1256
Epoch 4/50
 - 0s - loss: 0.1276 - val_loss: 0.1198
Epoch 5/50
 - 0s - loss: 0.1256 - val_loss: 0.1179
Epoch 6/50
 - 0s - loss: 0.1250 - val_loss: 0.1188
Epoch 7/50
 - 0s - loss: 0.1258 - val_loss: 0.1183
Epoch 8/50
 - 1s - loss: 0.1258 - val_loss: 0.1199
Epoch 9/50
 - 0s - loss: 0.1256 - val_loss: 0.1179
Epoch 10/50
 - 0s - loss: 0.1255 - val_loss: 0.1192
Epoch 11/50
 - 0s - loss: 0.1247 - val_loss: 0.1180
Epoch 12/50
 - 0s - loss: 0.1254 - val_loss: 0.1185
Epoch 13/50
 - 0s - loss: 0.1252 - val_loss: 0.1176
Epoch 14/50
 - 0s - loss: 0.1258 - val_loss: 0.1197
Epoch 15/50
 - 0s - loss: 0.1251 - val_loss: 0.1175
Epoch 16/50
 - 0s - loss: 0.1253 - val_loss: 0.1176
Epoch 17/50
 - 0s - loss: 0.1247 - val_loss: 0.1183
Epoch 18/50
 - 0s - loss: 0.1249 - val_loss: 0.1178
Epoch 19/50
 - 0s - loss: 0.1253 - val_loss: 0.1178
Epoch 20/50
 - 0s - loss: 0.1253 - val_loss: 0.1181
Epoch 21/50
 - 0s - loss: 0.1245 - val_loss: 0.1192
Epoch 22/50
 - 0s - loss: 0.1250 - val_loss: 0.1187
Epoch 23/50
 - 0s - loss: 0.1244 - val_loss: 0.1184
Epoch 24/50
 - 0s - loss: 0.1252 - val_loss: 0.1188
Epoch 25/50
 - 0s - loss: 0.1253 - val_loss: 0.1197
Epoch 26/50
 - 0s - loss: 0.1253 - val_loss: 0.1192
Epoch 27/50
 - 0s - loss: 0.1267 - val_loss: 0.1177
Epoch 28/50
 - 0s - loss: 0.1256 - val_loss: 0.1182
Epoch 29/50
 - 0s - loss: 0.1247 - val_loss: 0.1178
Epoch 30/50
 - 0s - loss: 0.1249 - val_loss: 0.1183
Epoch 31/50
 - 0s - loss: 0.1259 - val_loss: 0.1189
Epoch 32/50
 - 0s - loss: 0.1258 - val_loss: 0.1187
Epoch 33/50
 - 0s - loss: 0.1248 - val_loss: 0.1179
Epoch 34/50
 - 0s - loss: 0.1259 - val_loss: 0.1203
Epoch 35/50
 - 0s - loss: 0.1252 - val_loss: 0.1190
Epoch 36/50
 - 0s - loss: 0.1260 - val_loss: 0.1192
Epoch 37/50
 - 0s - loss: 0.1249 - val_loss: 0.1183
Epoch 38/50
 - 0s - loss: 0.1249 - val_loss: 0.1187
Epoch 39/50
 - 0s - loss: 0.1252 - val_loss: 0.1185
Epoch 40/50
 - 0s - loss: 0.1246 - val_loss: 0.1183
Epoch 41/50
 - 0s - loss: 0.1247 - val_loss: 0.1179
Epoch 42/50
 - 0s - loss: 0.1242 - val_loss: 0.1194
Epoch 43/50
 - 0s - loss: 0.1255 - val_loss: 0.1187
Epoch 44/50
 - 0s - loss: 0.1244 - val_loss: 0.1176
Epoch 45/50
 - 0s - loss: 0.1248 - val_loss: 0.1183
Epoch 46/50
 - 0s - loss: 0.1257 - val_loss: 0.1179
Epoch 47/50
 - 0s - loss: 0.1248 - val_loss: 0.1177
Epoch 48/50
 - 0s - loss: 0.1247 - val_loss: 0.1194
Epoch 49/50
 - 0s - loss: 0.1248 - val_loss: 0.1181
Epoch 50/50
 - 0s - loss: 0.1245 - val_loss: 0.1182

One possible problem might be your timesteps. You reshaped the input into a shape with timesteps = 1. If we'd like to take advantage of the characteristics of an LSTM, the number of timesteps should be greater than 1, shouldn't it?

If you have the steering angle for 3 consecutive time steps for each data point, then you could try timesteps = 3, e.g. as in the sketch below.
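A minimal sketch of that idea, assuming the scaled steering_angle and flag columns from the question are available in data; the window length and the make_windows helper are just illustrative, not part of the original code:

import numpy as np

def make_windows(values, labels, n_steps=3):
    # Build overlapping windows of n_steps consecutive angles;
    # each window takes the flag of its last time step as the label.
    X_seq, y_seq = [], []
    for i in range(n_steps - 1, len(values)):
        X_seq.append(values[i - n_steps + 1:i + 1])
        y_seq.append(labels[i])
    return np.array(X_seq).reshape(-1, n_steps, 1), np.array(y_seq)

angles = data['steering_angle'].values
flags = data['flag'].values
X_seq, y_seq = make_windows(angles, flags, n_steps=3)
print(X_seq.shape)  # (n_samples, 3, 1): timesteps = 3, one feature per step

The LSTM layer would then be built with input_shape=(3, 1), so each sample the network sees is a short sequence of three consecutive angles rather than a single step.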
