
Input data shapes & sizes, RNN Keras, regression

I'm having trouble getting my data into the correct format for an RNN with Keras. I have a CSV file with 22 columns and 1344 rows. My data consists of continuous variables recorded at 30-minute intervals over a number of weeks.

I understand that Keras requires input in the format (num samples, timesteps, nfeatures). So for my data I saw this as (1344, 48, 22), as there are 48 readings in a 24-hour period in my data.

The x data has the shape (1344, 22) when imported from the CSV.

Here is my code:

model=Sequential()
model.add(LSTM(21, input_shape=(1344,22),kernel_initializer='normal',activation='relu',return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(19, activation='relu')) #hidden layer 2
model.add(Dropout(0.2))
model.add(Dense(8, activation='relu')) #output layer
model.compile(loss='mean_squared_error', optimizer=optimiser,metrics=['accuracy','mse'])

which resulted in the error: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (1344, 22)

I tried to get the x data into the correct shape by adding an Embedding layer. My code now reads:

model=Sequential()
model.add(Embedding(input_dim=22,input_length=1344,output_dim=48))
model.add(LSTM(21, input_shape=(1344,22), kernel_initializer='normal',activation='relu',return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(19, activation='relu')) #hidden layer 2
model.add(Dropout(0.2))
model.add(Dense(8, activation='relu')) #output layer
model.compile(loss='mean_squared_error', optimizer=optimiser,metrics=['accuracy','mse'])
history=model.fit(x,y, verbose=0,epochs=150, batch_size=70, validation_split=0.2)

resulting in the error: Error when checking input: expected embedding_1_input to have shape (1344,) but got array with shape (22,).

I'm not sure I have fully understood the Embedding layer or the meanings of (num samples, timesteps, nfeatures). Could someone explain the meanings of input_dim, input_length and output_dim with reference to my data? I've read many other posts on this issue and can't seem to apply their fixes to my type of data!

Many thanks for your help.

You can feed the data directly to the LSTM without using an Embedding layer.

1344 rows => so I assume each row of 22 columns is a reading taken at one time point.

The input to the LSTM has three parts:

(1, 48, 22) => batch size = 1, time steps = 48, input feature size = 22.

The batch size is optional, which is why input_shape=(48, 22) in the code below leaves it out. 'time steps' is how many past time points you would like to use to make each prediction. In the example below, 48 means that the past 24 hours' worth of data will be used for prediction. So you have to reshape the 1344 rows of data into something like this:

1st sample = rows 1 - 48

2nd sample = rows 2 - 49, and so on.
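To make the row arithmetic concrete, here is a minimal pure-Python sketch that lists which rows fall into each sample. The helper name window_rows is hypothetical, and it assumes the target for each sample is the single row right after its window, which is how the create_dataset function later in this answer pairs inputs and targets.

```python
def window_rows(n_rows, look_back):
    """Return 1-indexed (first_row, last_row, target_row) for each sample.

    Assumption: the value to predict is the row immediately after the window.
    """
    samples = []
    for start in range(n_rows - look_back):
        first = start + 1          # 1-indexed first input row of the window
        last = start + look_back   # 1-indexed last input row of the window
        target = last + 1          # row whose values are predicted
        samples.append((first, last, target))
    return samples


rows = window_rows(1344, 48)
print(rows[0])    # (1, 48, 49)
print(rows[1])    # (2, 49, 50)
print(rows[-1])   # (1296, 1343, 1344)
print(len(rows))  # 1296
```

So 1344 rows with a 48-step window give 1296 samples rather than 1344, because the last 48 rows have no complete window plus target after them.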

model.add(LSTM(21, input_shape=(48,22),kernel_initializer='normal',activation='relu', return_sequences=True))

# Other layers remain the same as in your first code snippet

print(model.predict(np.zeros((1,48,22)))) # Feed dummy sample to network
[[0. 0. 0. 0. 0. 0. 0. 0.]]
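The dummy input above has shape (1, 48, 22): one sample of 48 time steps with 22 features each. As a rough illustration of how a 2-D table becomes that 3-D shape, the sketch below cuts the 1344 rows into non-overlapping 24-hour blocks. This is an assumed alternative to the overlapping windows used in this answer, shown only to make the three dimensions visible; the variable names are illustrative and the data is a stand-in for the real CSV.

```python
import numpy as np

steps_per_day = 48  # 48 half-hourly readings per 24 hours

# Stand-in for the CSV contents: 1344 rows, 22 columns
data = np.arange(1344 * 22, dtype=float).reshape(1344, 22)

# Drop any incomplete trailing day, then split into (days, steps, features)
n_days = data.shape[0] // steps_per_day
daily = data[: n_days * steps_per_day].reshape(n_days, steps_per_day, data.shape[1])

print(daily.shape)      # (28, 48, 22)
print(daily[:1].shape)  # (1, 48, 22)
```

Note that this yields only 28 samples, one per day, whereas overlapping windows produce far more training samples from the same data.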

import numpy as np

def create_dataset(dataset, look_back):
    # Build overlapping windows: X[i] holds rows i .. i+look_back-1,
    # and Y[i] is taken from the row that follows the window.
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back):
        dataX.append(dataset[i:(i+look_back)]) # all 22 columns for X
        dataY.append(dataset[i + look_back, 0:8]) # first 8 columns for Y, just as an example
    return np.array(dataX), np.array(dataY)

csv_data = np.random.randn(1344,22) # simulate csv data
X, Y = create_dataset(csv_data, 48) 
print(X.shape, Y.shape) # (1296, 48, 22) (1296, 8)
model.fit(X, Y)
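If the Python loop becomes slow on larger files, the same windows can probably be built without a loop. The following is a sketch, not part of the original answer: it assumes NumPy 1.20 or newer (for sliding_window_view), and the function name create_dataset_vectorized and the n_targets parameter are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def create_dataset_vectorized(dataset, look_back, n_targets=8):
    # sliding_window_view appends the window axis last, so the result has
    # shape (n_rows - look_back + 1, n_features, look_back).
    windows = sliding_window_view(dataset, window_shape=look_back, axis=0)
    # Drop the final window (it has no following row to predict) and move
    # the time axis to the middle: (samples, look_back, n_features).
    X = windows[:-1].transpose(0, 2, 1)
    # Targets: the first n_targets columns of the row after each window.
    Y = dataset[look_back:, :n_targets]
    # Copy X so it is a normal contiguous, writable array.
    return np.ascontiguousarray(X), Y


rng = np.random.default_rng(0)
csv_data = rng.standard_normal((1344, 22))  # simulated CSV data
X, Y = create_dataset_vectorized(csv_data, 48)
print(X.shape, Y.shape)  # (1296, 48, 22) (1296, 8)
```

The shapes match the loop version, so either output can be passed to model.fit in the same way.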

A simple example of cosine wave prediction that is easy to play around with. The create_dataset function is from this link: https://github.com/sachinruk/PyData_Keras_Talk/blob/master/cosine_LSTM.ipynb

Regarding reshaping data: https://machinelearningmastery.com/reshape-input-data-long-short-term-memory-networks-keras/


 