Keras LSTM Input/Output Dimension
I am constructing an LSTM predictor with Keras. My input array is historical price data. I segment the data into window_size blocks in order to predict prediction_length blocks ahead. My data is a list of 4246 floating point numbers. I separate my data into 4055 arrays, each of length 168, in order to predict 24 units ahead.

This gives me an x_train set with dimension (4055, 168). I then scale my data and try to fit it, but I run into a dimension error.
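The block count above follows directly from the window arithmetic; a quick sanity check of those numbers:

```python
# Sliding-window arithmetic for the setup described above.
series_len = 4246     # total number of price points
window_size = 7 * 24  # 168 input steps per block
horizon = 24          # steps predicted ahead

# Each block needs window_size inputs plus horizon targets after it,
# so valid start indices run from 0 to series_len - window_size - horizon.
num_pred_blocks = series_len - window_size - horizon + 1
print(num_pred_blocks)  # 4055
```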
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation, TimeDistributed

df = pd.DataFrame(data)
print(f"Len of df: {len(df)}")
min_max_scaler = MinMaxScaler()

H = 24
window_size = 7 * H
num_pred_blocks = len(df) - window_size - H + 1

x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)]
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)]
    y_train.append(y_train_block)

LEN = int(len(x_train) * window_size)
x_train = min_max_scaler.fit_transform(x_train)

batch_size = 1
def build_model():
    model = Sequential()
    model.add(LSTM(input_shape=(window_size, batch_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model

num_epochs = epochs
model = build_model()
model.fit(x_train, y_train, batch_size=batch_size, epochs=50)
The error being returned is as follows.
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 4055 arrays: [array([[0.00630006],
Am I not segmenting correctly? Loading correctly? Should the number of units be different than the number of prediction blocks? I appreciate any help. Thanks.
The suggestions to convert the lists to Numpy arrays are correct, but MinMaxScaler() already returns a numpy array. I reshaped the arrays into the proper dimensions, but now my computer is having a CUDA memory error. I consider the problem solved. Thank you.
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation, TimeDistributed

df = pd.DataFrame(data)
min_max_scaler = MinMaxScaler()

H = prediction_length
window_size = 7 * H
num_pred_blocks = len(df) - window_size - H + 1

x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)].values
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)].values
    y_train.append(y_train_block)

x_train = min_max_scaler.fit_transform(x_train)
y_train = min_max_scaler.fit_transform(y_train)
x_train = np.reshape(x_train, (len(x_train), 1, window_size))
y_train = np.reshape(y_train, (len(y_train), 1, H))

batch_size = 1
def build_model():
    model = Sequential()
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=100))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model

num_epochs = epochs
model = build_model()
model.fit(x_train, y_train, batch_size=batch_size, epochs=50)
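As a side note on the scaling step: MinMaxScaler.fit_transform stacks the input into a 2-D array and rescales each column independently, returning a numpy array. A numpy-only sketch of that column-wise transformation (illustrative values, not the question's data):

```python
import numpy as np

# Minimal numpy equivalent of sklearn's MinMaxScaler on a 2-D array:
# each column is rescaled to [0, 1] independently.
x = np.array([[1.0, 10.0],
              [2.0, 30.0],
              [3.0, 20.0]])

col_min = x.min(axis=0)
col_max = x.max(axis=0)
scaled = (x - col_min) / (col_max - col_min)
print(scaled)
# column 0 -> [0.0, 0.5, 1.0], column 1 -> [0.0, 1.0, 0.5]
```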
I don't think you passed the batch size in the model. input_shape=(window_size, batch_size) is the data dimension, which is correct, but you should use input_shape=(window_size, 1).

If you want to use batches, you have to add another dimension, like this: LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2])) (cited from the Keras documentation).

In your case:

def build_model():
    model = Sequential()
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model

You also need to use np.reshape to change the dimensions of your data; they should be (batch_dim, data_dim_1, data_dim_2). I use numpy, so numpy.reshape() will work.

First, your data should be row-wise, so each row should have a shape of (1, 168); then add the batch dimension, and it will be (batch_n, 1, 168).

Hope this helps.
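The reshape described above can be sketched with plain numpy, assuming the 4055 windows of length 168 from the question:

```python
import numpy as np

# 4055 windows, each 168 steps long, as a 2-D array (one window per row).
x_train = np.zeros((4055, 168))

# Insert the extra axis so each sample becomes one timestep of 168
# features: (batch_n, 1, 168), matching batch_input_shape=(batch, 1, 168).
x_train = np.reshape(x_train, (x_train.shape[0], 1, x_train.shape[1]))
print(x_train.shape)  # (4055, 1, 168)
```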
That's probably because x_train and y_train were not updated to numpy arrays. Take a closer look at this issue on GitHub.
model = build_model()
x_train, y_train = np.array(x_train), np.array(y_train)
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
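The difference this conversion makes can be seen without Keras at all: a Python list of per-window arrays has no single shape, while np.array stacks equal-length windows into one 2-D array of the kind model.fit expects (toy sizes below for illustration):

```python
import numpy as np

# A list of 1-D arrays, like what the loop in the question builds ...
x_train = [np.arange(4, dtype=float) for _ in range(3)]
print(type(x_train))   # <class 'list'> -- no .shape attribute

# ... versus a single stacked array after conversion.
x_train = np.array(x_train)
print(x_train.shape)   # (3, 4)
```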