
Understanding the structure of my LSTM model

I'm trying to solve the following problem:

I have time series data from a number of devices. Each device recording has length 3000, and every data point captured has 4 measurements, so my data has shape (number of device recordings, 3000, 4).

I'm trying to produce a vector of length 3000 where each data point is one of 3 labels (y1, y2, y3), so my desired output shape is (number of device recordings, 3000, 1). I have labeled data for training.

I'm trying to use an LSTM model for this, as 'classification as I move along time series data' seems like an RNN type of problem.

I have my network set up like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

model = Sequential()
model.add(LSTM(3, input_shape=(3000, 4), return_sequences=True))
model.add(LSTM(3, activation='softmax', return_sequences=True))

model.summary()

And the summary looks like this:

Model: "sequential_23"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_29 (LSTM)               (None, 3000, 3)           96        
_________________________________________________________________
lstm_30 (LSTM)               (None, 3000, 3)           84        
=================================================================
Total params: 180
Trainable params: 180
Non-trainable params: 0
_________________________________________________________________

Everything looks fine in the output space, as I can use the result from each unit to determine which of my three categories a particular time step belongs to (I think).

But I only have 180 trainable parameters, so I'm guessing that I am doing something horribly wrong.

Can someone help me understand why I have so few trainable parameters? Am I misinterpreting how to set up this LSTM? Am I just worrying over nothing?

Do those 3 units mean I only have 3 LSTM 'blocks', and that the model can only look back 3 observations?

From a simplified viewpoint, you can think of an LSTM layer as an augmented Dense layer with a memory (hence enabling efficient processing of sequences). So the concept of "units" is the same for both: the number of neurons or feature units in the layer, or in other words, the number of distinct features the layer can extract from its input.

Therefore, when you set the number of units to 3 for the LSTM layer, it more or less means that this layer can only extract 3 distinct features from the input timesteps. Note that the number of units has nothing to do with the length of the input sequence: the entire input sequence is processed by the LSTM layer regardless of the number of units or the sequence length.
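To make the 180 parameters in the question concrete, here is a minimal sketch of the usual parameter count for a standard LSTM layer with bias terms: four gate blocks, each with an input kernel, a recurrent kernel, and a bias vector. The helper name lstm_param_count is made up for illustration, and the formula assumes the default Keras LSTM configuration. Note that the sequence length (3000) does not appear anywhere in it.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters of a standard LSTM layer with bias terms (assumed default setup)."""
    # Each of the 4 gate blocks (input, forget, cell candidate, output) has:
    #   an input kernel      (input_dim x units)
    #   a recurrent kernel   (units x units)
    #   a bias vector        (units)
    per_gate = input_dim * units + units * units + units
    return 4 * per_gate


# The two layers from the question: 4 measurements -> 3 units, then 3 -> 3 units.
first = lstm_param_count(input_dim=4, units=3)
second = lstm_param_count(input_dim=3, units=3)
print(first, second, first + second)  # 96 84 180
```

So the small total is simply a consequence of using 3 units per layer; the same weights are reused at every one of the 3000 time steps.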

Usually, this might be sub-optimal, though it really depends on the difficulty of the specific problem and dataset you are working on; 3 units might be enough for your problem/dataset, and you should experiment to find out. Therefore, a higher number of units is often chosen (common choices: 32, 64, 128, 256), and the classification task is delegated to a dedicated Dense layer (sometimes called a "softmax layer") at the top of the model.

For example, given the description of your problem, a model with 3 stacked LSTM layers and a Dense classification layer at the top might look like this:

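from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(3000, 4)))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(32, return_sequences=True))
# Dense is applied at every time step, giving 3 class probabilities per step
model.add(Dense(3, activation='softmax'))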
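With a softmax Dense layer on top, the model outputs 3 class probabilities per time step, i.e. shape (number of device recordings, 3000, 3), rather than the (…, 3000, 1) label vector described in the question. A plain-Python sketch of how per-step scores could be turned back into one label per step is shown below; the helper names and the score values are made up for illustration and are not part of Keras.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_labels(score_sequence):
    """Pick the most probable class index at each time step."""
    labels = []
    for scores in score_sequence:
        probs = softmax(scores)
        labels.append(probs.index(max(probs)))
    return labels


# Hypothetical raw scores for 4 time steps and 3 classes (made-up numbers).
scores = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.2, 3.0], [1.0, 0.9, 0.8]]
labels = decode_labels(scores)
print(labels)  # [0, 1, 2, 0]
```

In practice the same step is typically a single argmax over the last axis of the model's predictions, which yields one integer label (0, 1, or 2) for each of the 3000 time steps.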
