LSTM model's predicted stock prices are significantly higher than expected

Question

I have built an LSTM model to estimate the next day's stock price, using TensorFlow and Keras.

However, I do not understand why my model's predicted price is almost always higher than the current stock price by a factor of 2 or 3. Does anybody know what I am doing wrong?

The code is shown below:

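#Imports assumed from the names used below (not shown in the original post)
import math
import numpy as np
import pandas_datareader as web
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM

def StockPredictor(stock, startdate, enddate, pricetype):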
    
    #Get the stock quote
    df = web.DataReader(stock, data_source = 'yahoo', start=startdate, end=enddate)
    #df = pd.read_csv('StockData/TATA.csv')
    
    #Create a new dataframe with only the price type chosen
    data = df.filter([pricetype])
    dataset = data.values  #convert dataset into a numpy array
    training_data_len = math.ceil(len(dataset) * 0.80) #use 80% of the dataset to train the LSTM model (rounded up with math.ceil)
    
    #Scale the data (normalizing input data) (helps the model)
    scaler = MinMaxScaler(feature_range=(0,1))  #all values in scaled_data lie between 0 and 1
    scaled_data = scaler.fit_transform(dataset)  #computes min and max values for scaling and transforms data based on these values
    
    #Create the training data set
    #Create the scaled training data set
    train_data = scaled_data[0:training_data_len , :]
    #split data into x_train and y_train datasets
    x_train, y_train = [], []  #x_train independent training feature, y dependent
    for i in range(60, len(train_data)):
        x_train.append(train_data[i-60:i,0])  #holds the values of the 60 previous periods
        y_train.append(train_data[i, 0])    #holds the 61st value, which we want the model to predict
    
    #Convert x_train and y_train to numpy arrays
    x_train, y_train = np.array(x_train), np.array(y_train)
    
    #reshape data (LSTM expects data to be 3D in the form of no. of samples, no. of timesteps and no. of features) (x_train is now 2D)
    x_train = np.reshape(x_train, (x_train.shape[0],x_train.shape[1], 1)) #reshape to 3D: x_train.shape[0] = number of rows in the 2D x_train, x_train.shape[1] = number of columns in the 2D x_train
    
    #Build the LSTM model
    model=Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
    model.add(LSTM(50, return_sequences= False))
    model.add(Dense(25))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error') #model has loss function and optimizer
    
    #training the model with the fit function
    model.fit(x_train, y_train, batch_size=1, epochs=1) #epochs is the number of passes over the dataset, forward and backward through the neural network
    
    #Create the testing data set
    #Create new array containing scaled values from index training_data_len - 60 onwards
    test_data = scaled_data[training_data_len - 60: , :]
    #create datasets x_test and y_test
    x_test = []
    y_test = dataset[training_data_len:, :]
    for i in range(60, len(test_data)):
        x_test.append(test_data[i-60:i,0])
        
    #convert data into numpy array
    x_test = np.array(x_test)
    
    #Reshape data (same explanation as for the x_train reshape above)
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
    
    #Get the model's predicted price values
    #predictions are based on x_test and should match the values in y_test
    predictions = model.predict(x_test)
    predictions = scaler.inverse_transform(predictions) #unscale the values
    
    #Get the RMSE (to test the model): square the errors first, then average, then take the root
    rmse = np.sqrt(np.mean((predictions - y_test)**2))
    rmse
    
    #Plot the data
    train = data[:training_data_len]
    valid = data[training_data_len:]
    valid['Predictions'] = predictions
        
    print('The RMSE for the training model =', rmse)
    
    new_df = df.filter([pricetype])
    #get the last 60 days
    last_60_days = new_df[-60:].values
    last_60_days_scaled = scaler.transform(last_60_days)
    #create empty list
    X_test = []
    #append past 60 days to list
    X_test.append(last_60_days)
    #Convert X_test to numpy array
    X_test = np.array(X_test)
    #reshape to 3D
    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1],1))
    #Get predicted scaled price
    pred_price = model.predict(X_test)
    #undo scaling
    pred_price = scaler.inverse_transform(pred_price)
    print('Predicted price for the next day is :',pred_price)
    
    return pred_price 

allprices = []
for i in range(10):
    pred_price = StockPredictor()
    allprices.append(pred_price)
    
average_pred_price = sum(allprices) / len(allprices)

Answer 1

You are using a min-max scaler between 0 and 1, where the fitted minimum and maximum are historical. Your LSTM model will predict a new high, and when you inverse_transform the prediction, it will likely land above the maximum that the scaler was fitted on.
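To make the inverse-transform behaviour concrete, here is a minimal sketch in plain Python. It hand-rolls the arithmetic that a min-max scaler with a (0, 1) feature range performs; the prices and the helper names `scale` and `unscale` are made up for illustration and are not part of the code in the question.

```python
# Hand-rolled min-max scaling, mirroring the arithmetic of a (0, 1) feature range.
prices = [100.0, 120.0, 150.0, 130.0, 200.0]  # made-up historical prices

lo, hi = min(prices), max(prices)  # the "fitted" minimum and maximum

def scale(x):
    """Map a price into the fitted range: lo -> 0.0, hi -> 1.0."""
    return (x - lo) / (hi - lo)

def unscale(s):
    """Inverse of scale(): map a scaled value back to a price."""
    return s * (hi - lo) + lo

# Scaled model outputs inside and outside the fitted range.
for s in (0.5, 1.0, 1.2, 2.5):
    print(s, "->", unscale(s))
```

Any scaled output above 1 maps to a price above the historical maximum, and the overshoot grows linearly with the output, so a model that emits values well above 1 will produce prices far beyond anything in the data.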

Therefore the scaler is the likely culprit behind your predictions being about 2x too high. Using a standard scaler might help, as might not scaling at all.
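Before swapping scalers, it may also be worth checking that every window passed to model.predict is scaled the same way as the training data. In the code in the question, the final prediction appends last_60_days to X_test rather than last_60_days_scaled, so the model receives raw prices although it was trained on values between 0 and 1. The guard below is only a sketch: `assert_scaled` and its tolerance are hypothetical, and the sample windows are made up.

```python
def assert_scaled(window, low=0.0, high=1.0, tol=0.5):
    """Raise if any value lies far outside the range the model was trained on.

    tol leaves room for genuine new highs or lows just outside [low, high].
    """
    bad = [v for v in window if v < low - tol or v > high + tol]
    if bad:
        raise ValueError(f"{len(bad)} value(s) look unscaled, e.g. {bad[0]}")
    return True

raw_window = [182.3, 185.1, 184.7]     # made-up raw prices
scaled_window = [0.81, 0.84, 0.83]     # made-up values in the scaler's output range

assert_scaled(scaled_window)           # passes silently
try:
    assert_scaled(raw_window)          # raw prices trip the guard
except ValueError as err:
    print("caught:", err)
```

Calling such a check right before model.predict would flag an unscaled window immediately instead of letting it surface later as an implausible price.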

Side Note

An LSTM that predicts stock prices from price data alone will not work.

What is likely to happen is that your LSTM model will predict prices with a T+1 lag, that is, its prediction for each day will closely track the previous day's price.
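One way to check for this lag is to compare the model against a persistence baseline that simply repeats the previous day's price. The sketch below uses plain Python with made-up prices and made-up model predictions; the `rmse` helper and all numbers are illustrative assumptions, not results from the code in the question.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]  # made-up closing prices
model_preds = [100.5, 101.8, 101.3, 104.6, 106.9]     # made-up predictions for days 1..5

actual = prices[1:]        # the values the model is supposed to predict
persistence = prices[:-1]  # "tomorrow equals today" baseline

print("model RMSE:      ", rmse(actual, model_preds))
print("persistence RMSE:", rmse(actual, persistence))
# How close the model's output is to simply repeating yesterday's price
print("model vs lag-1:  ", rmse(persistence, model_preds))
```

If the model's error against the actual prices is no better than the persistence error, while its distance from the lag-1 series is small, the model has most likely learned to echo the last observed price.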

Price data inherently contains noise, brought about by retail traders and, especially now, by social sentiment trading. An LSTM is likely to overfit on historical noise, which is not representative of future noise.

For more information on the problem of noise, see https://www.investopedia.com/articles/trading/06/marketnoise.asp
