Input Shape Keras RNN
I'm working with time-series data of shape 2000x1001, where 2000 is the number of cases and the first 1000 columns hold the time-domain data: displacements in the X direction over a 1 sec period, meaning that the timestep is 0.001 s. The last column is the speed, the output value that I need to predict from the displacements during that 1 sec.
How should the input data be shaped for an RNN in Keras? I've gone through some tutorials, but I'm still confused about the input shape for an RNN.
Thanks in advance
import numpy as np
from numpy import loadtxt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Load the data
dataset = loadtxt("Data.csv", delimiter=",")
x = dataset[:, :1000]
y = dataset[:, 1000]
# Create train and test datasets with an 80:20 split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
# Input scaling
scaler = StandardScaler()
x_train_s = scaler.fit_transform(x_train)
x_test_s = scaler.transform(x_test)
num_samples = x_train_s.shape[0]  # Number of samples
num_vals = x_train_s.shape[1]  # Number of elements in each sample
x_train_s = np.reshape(x_train_s, (num_samples, num_vals, 1))
# Create model
model = Sequential()
model.add(LSTM(100, input_shape=(num_vals, 1)))
model.add(Dense(1, activation='relu'))
model.compile(loss='mae', optimizer='adam', metrics=['mape'])
model.summary()
# Training
history = model.fit(x_train_s, y_train, epochs=10, verbose=1, batch_size=64)
Look at this code: it tries to predict the next 4 values based on the previous 6 values. Follow the comments within the code to see how a very simple input is manipulated for use as input to an RNN/LSTM.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
"""
Create a toy dataset.
We use input_sequence below as the sequence from which data points are built.
As per the question, we use 6 points to predict the next 4 points.
"""
input_sequence = [1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10]
X_train = []
y_train = []
# The first 6 points are the input data and the next 4 points are the label.
# We then shift by 1 and build further input/label pairs the same way.
for i in range(len(input_sequence)-9):
    X_train.append(input_sequence[i:i+6])
    y_train.append(input_sequence[i+6:i+10])
X_train = np.array(X_train, dtype=np.float32)
y_train = np.array(y_train, dtype=np.int32)
# X_test for the predictions (contains 6 points)
X_test = np.array([[8,9,10,1,2,3]],dtype=np.float32)
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
# We use a basic LSTM, which accepts input shaped [num_inputs, time_steps, data_points], so we reshape accordingly.
# Here:
# 1. num_inputs = how many sequences of 6 points to use, i.e. rows (we use X_train.shape[0])
# 2. time_steps = number of steps per input sequence (1 here: each window of 6 points is fed as a single step)
# 3. data_points = number of values per step (6 in our case)
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_test = np.reshape(X_test, (X_test.shape[0], 1, X_test.shape[1]))
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
x_points = X_train.shape[-1]
print("one input contains {} points".format(x_points))
model = Sequential()
model.add(LSTM(4, input_shape=(1, x_points)))
model.add(Dense(4))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()
model.fit(X_train, y_train, epochs=500, batch_size=5, verbose=2)
output = list(map(np.ceil, model.predict(X_test)))
print(output)
Hope it helps. Please ask if anything is unclear.
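Note that the reshape in the code above feeds each 6-point window as a single timestep with 6 features. A minimal NumPy-only sketch, assuming the same toy sequence, contrasts that layout with the alternative of 6 timesteps with 1 feature each. The names `windows`, `as_features` and `as_timesteps` are illustrative and not part of the original answer.

```python
import numpy as np

# Same 30-value toy sequence as in the answer: 1..10 repeated three times.
input_sequence = list(range(1, 11)) * 3

# Build the 6-point input windows exactly as the answer's loop does.
windows = np.array(
    [input_sequence[i:i + 6] for i in range(len(input_sequence) - 9)],
    dtype=np.float32,
)

# Layout used in the answer: one timestep carrying 6 features.
as_features = windows.reshape(windows.shape[0], 1, windows.shape[1])

# Alternative layout: 6 timesteps carrying 1 feature each.
as_timesteps = windows.reshape(windows.shape[0], windows.shape[1], 1)

print(windows.shape, as_features.shape, as_timesteps.shape)
```

With the second layout, the LSTM layer would need `input_shape=(6, 1)` instead of `(1, x_points)`, so that the recurrence actually steps through the 6 values in order.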
As explained in the docs, Keras expects the following input shape for an RNN:
(batch_size, timesteps, input_dim)
batch_size is the number of samples you feed before a backprop step
timesteps is the number of timesteps for each sample
input_dim is the number of features for each timestep
EDIT, more details:
In your case you should go for
batch_input_shape = (batch_size, timesteps, 1)
with batch_size and timesteps selected as you wish.
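As a rough illustration of that shape, here is a minimal sketch that assumes the whole 1000-point record is used as the timesteps of each sample. It uses a random array as a stand-in for the real displacement data, and the names `x` and `x_rnn` are illustrative.

```python
import numpy as np

# Stand-in for the 2000 x 1000 displacement matrix from the question
# (random values, used only to show the shapes).
x = np.random.rand(2000, 1000).astype(np.float32)

# Add a trailing feature axis:
# (samples, timesteps) -> (samples, timesteps, input_dim=1).
x_rnn = x.reshape(x.shape[0], x.shape[1], 1)

print(x_rnn.shape)  # (2000, 1000, 1)
```

The leading dimension here is the total number of samples. The batch size itself is then whatever is passed as `batch_size` when calling `fit`.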
What about the timesteps?
Let's say you take one of your 2000 samples, and let's say that your sample has 10 elements instead of 1000, for example:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Then, if we choose timesteps=3, you get a batch of length 8:
[[[0], [1], [2]],
[[1], [2], [3]],
[[2], [3], [4]],
[[3], [4], [5]],
[[4], [5], [6]],
[[5], [6], [7]],
[[6], [7], [8]],
[[7], [8], [9]]]
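The windowing above can be sketched in plain NumPy. The helper `make_windows` is a hypothetical name introduced here for illustration. It is not a Keras function.

```python
import numpy as np

def make_windows(sample, timesteps):
    """Return overlapping windows of length `timesteps`, shaped (n_windows, timesteps, 1)."""
    sample = np.asarray(sample)
    n_windows = len(sample) - timesteps + 1
    # Slide a window of `timesteps` values over the sample, one step at a time.
    windows = np.stack([sample[i:i + timesteps] for i in range(n_windows)])
    # Add the trailing feature axis (input_dim = 1).
    return windows[..., np.newaxis]

batch = make_windows(range(10), timesteps=3)
print(batch.shape)  # (8, 3, 1)
```

A sample of length n yields n - timesteps + 1 windows, which is why the 10-element example gives a batch of length 8.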