
Keras LSTM Input/Output Dimension

I am constructing an LSTM predictor with Keras. My input array is historical price data. I segment the data into window_size blocks in order to predict prediction-length blocks ahead. My data is a list of 4246 floating-point numbers. I separate the data into 4055 arrays, each of length 168, in order to predict 24 units ahead.

This gives me an x_train set with dimension (4055, 168). I then scale my data and try to fit it, but run into a dimension error.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, Activation

df = pd.DataFrame(data)
print(f"Len of df: {len(df)}")
min_max_scaler = MinMaxScaler()
H = 24

window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1

x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)]
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)]
    y_train.append(y_train_block)

LEN = int(len(x_train)*window_size)
x_train = min_max_scaler.fit_transform(x_train)
batch_size = 1

def build_model():
    model = Sequential()
    model.add(LSTM(input_shape=(window_size,batch_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model

num_epochs = epochs
model= build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)

The error returned is as follows.

ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 4055 arrays: [array([[0.00630006],

Am I not segmenting correctly? Loading correctly? Should the number of units be different from the number of prediction blocks? I appreciate any help. Thanks.
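For reference, the windowing arithmetic described in the question can be sketched with plain numpy; the price series below is placeholder random data standing in for the real input:

```python
import numpy as np

data = np.random.rand(4246)          # stand-in for the 4246 historical prices

H = 24                               # prediction length
window_size = 7 * H                  # 168
num_pred_blocks = len(data) - window_size - H + 1   # 4246 - 168 - 24 + 1 = 4055

# Each window of 168 prices is paired with the following 24 prices as the target.
x = np.stack([data[i:i + window_size] for i in range(num_pred_blocks)])
y = np.stack([data[i + window_size:i + window_size + H] for i in range(num_pred_blocks)])

print(x.shape, y.shape)              # (4055, 168) (4055, 24)
```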

Edit

The suggestions to convert them to Numpy arrays are correct, but MinMaxScaler() already returns a numpy array. I reshaped the arrays into the proper dimensions, but now my computer is running into a CUDA memory error. I consider the problem solved. Thank you.

df = pd.DataFrame(data)
min_max_scaler = MinMaxScaler()
H = prediction_length

window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1

x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)].values
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)].values
    y_train.append(y_train_block)

x_train = min_max_scaler.fit_transform(x_train)
y_train = min_max_scaler.fit_transform(y_train)
x_train = np.reshape(x_train, (len(x_train), 1, window_size))
y_train = np.reshape(y_train, (len(y_train), 1, H))
batch_size = 1

def build_model():
    model = Sequential()
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=100))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model

num_epochs = epochs
model = build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)

I don't think you passed the batch size into the model correctly.

input_shape is where the data dimensions go, so input_shape=(window_size, batch_size) is not right; you should use input_shape=(window_size, 1).
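A quick sketch of what input_shape=(window_size, 1) implies for the data: Keras LSTM layers take 3-D input of shape (samples, timesteps, features), so each window of 168 prices becomes 168 timesteps of a single feature (placeholder random data below):

```python
import numpy as np

window_size = 168
x_train = np.random.rand(4055, window_size)   # scaled windows, as in the question

# (samples, timesteps, features): 168 timesteps, 1 feature per timestep.
x_3d = x_train.reshape(len(x_train), window_size, 1)
print(x_3d.shape)   # (4055, 168, 1) -- matches input_shape=(window_size, 1)
```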

If you want to set the batch size, you have to add another dimension, like this: LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2])) (cited from the Keras documentation).

In your case:

def build_model():
    model = Sequential()
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model

You also need to reshape your data so its dimensions are (batch_dim, data_dim_1, data_dim_2). I use numpy, so numpy.reshape() will work.

First, your data should be row-wise, so each row has a shape of (1, 168); then add the batch dimension, giving (batch_n, 1, 168).

Hope this helps.

That's probably because x_train and y_train were not converted to numpy arrays. Take a closer look at this issue on GitHub.

model = build_model()
x_train, y_train = np.array(x_train), np.array(y_train)
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
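A minimal sketch of why this conversion matters, with small placeholder windows standing in for the real data: model.fit() expects a single numpy array per input, not a Python list of per-window arrays, which is exactly what the error message complains about.

```python
import numpy as np

# A list of 1-D arrays, as produced by the loop in the question (placeholder sizes).
x_train = [np.random.rand(168) for _ in range(5)]
y_train = [np.random.rand(24) for _ in range(5)]

# One call turns each list into a single 2-D array that fit() will accept.
x_train, y_train = np.array(x_train), np.array(y_train)
print(x_train.shape, y_train.shape)   # (5, 168) (5, 24)
```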
