
batch_input_shape tuple on Keras LSTM

I have the following feature vector, which consists of a single feature for each sample and 32 samples in total:

X = [[0.1], [0.12], [0.3] ... [0.10]]

and a label vector that consists of binary values

Y = [0, 1, 0, 0, ... 1] (with 32 samples as well)

I'm trying to use Keras LSTM to predict the next value of the sequence based on a new entry. What I can't figure out is what the "batch_input_shape" tuple means, for instance:

 model.add(LSTM(neurons, batch_input_shape=(?, ?, ?), return_sequences=False, stateful=True))

According to this article the first one is the batch size, but what about the other two? Are they the number of features for each sample and the number of samples? What should be the value of batch_size in this case?

At the moment I'm getting the following error message:

ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (32, 1)

Edit: Here is the model declaration:

def create_lstm(batch_size, n_samples, neurons, dropout):
    model = Sequential()
    model.add(LSTM(neurons, batch_size=batch_size, input_shape=(n_samples, 1), return_sequences=False, stateful=True))
    model.add(Dropout(dropout))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

According to this Keras Sequential Model guide on "stateful" LSTM (at the very bottom), we can see what those three elements mean:

Expected input batch shape: (batch_size, timesteps, data_dim). Note that we have to provide the full batch_input_shape since the network is stateful. The sample of index i in batch k is the follow-up for the sample i in batch k-1.

The first one, as you already discovered, is the size of the batches to be used during training. How large it should be depends in part on your specific problem, but mostly it is determined by the size of your dataset. If you specify a batch size of x and your dataset contains N samples, during training your data will be split into N/x groups (batches) of size x each.

Therefore, you probably want your batch size to be smaller than the size of your dataset. There is no single correct value, but you want it to be proportionally smaller (say, one or two orders of magnitude) than your whole dataset. Some people prefer to use powers of 2 (32, 128, etc.) as their batch sizes. It is also possible in some cases to not use batches at all and train with all your data at once (although that is not necessarily better).

The other two values are the timesteps (the size of your temporal dimension, i.e. how many "frames" each sample sequence has) and the data dimension (that is, the size of your data vector at each timestep).
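To make the three values concrete, here is a minimal sketch of a stateful LSTM declared with an explicit batch_input_shape; the specific numbers (batch size 4, 16 units, and so on) are only illustrative assumptions, not values from your problem:

from keras.models import Sequential
from keras.layers import LSTM, Dense

batch_size = 4   # samples fed to the network together on each update
timesteps = 1    # length of each input sequence
data_dim = 1     # number of features at each timestep

model = Sequential()
model.add(LSTM(16, batch_input_shape=(batch_size, timesteps, data_dim),
               return_sequences=False, stateful=True))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])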

For example, say your input sequences look like X = [[0.54, 0.3], [0.11, 0.2], [0.37, 0.81]]. We can see that this sequence has timesteps of 3 and a data_dim of 2.
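As a quick sketch (assuming that sequence is wrapped as a single sample), you can check the resulting 3-D shape with NumPy:

import numpy as np

# one sample, 3 timesteps, 2 features per timestep
X = np.array([[[0.54, 0.3], [0.11, 0.2], [0.37, 0.81]]])
print(X.shape)  # (1, 3, 2) -> (samples, timesteps, data_dim)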

So, the ValueError you are getting is most probably due to this (the error even hints that it expected 3 dimensions). Also, make sure your array is a NumPy array.
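In your case each of the 32 samples looks like a single-timestep sequence with one feature, so one way (a sketch under that assumption) to give the network the 3-D input it expects is to reshape your array:

import numpy as np

X = np.array(X)                    # make sure it is a NumPy array
X = X.reshape((X.shape[0], 1, 1))  # (samples, timesteps, data_dim) -> (32, 1, 1)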

As a last comment, given that you say you have 32 samples in total (that is, your whole dataset contains 32 samples), I consider that too little data to be using batches; usually the smallest batch size I have seen is 32, so consider obtaining more data before trying to use batch training. Hope this helps.
