
Tensorflow dynamic RNN (LSTM): how to format input?

I have been given some data in the following format, with these details:

person1, day1, feature1, feature2, ..., featureN, label
person1, day2, feature1, feature2, ..., featureN, label
...
person1, dayN, feature1, feature2, ..., featureN, label
person2, day1, feature1, feature2, ..., featureN, label
person2, day2, feature1, feature2, ..., featureN, label
...
person2, dayN, feature1, feature2, ..., featureN, label
...
  • there is always the same number of features, but any feature may be 0, meaning it is absent
  • the number of days available varies per person, e.g. person1 has 20 days of data, person2 has 50

The goal is to predict the label of the person for the following day, i.e. the label for dayN+1, either on a per-person basis or overall (per-person makes more sense to me). I can freely reformat the data (it is not large). Based on the above, after some reading I thought a dynamic RNN (LSTM) could work best:

  • recurrent neural network: because the next day relies on the previous day
  • lstm: because the model builds up with each day
  • dynamic: because not all features are present each day

If this does not make sense for the data I have, please stop me here. The question is then:

How do I give/format this data for tensorflow/tflearn?

I have looked at this example using tflearn, but I do not understand its input format well enough to 'mirror' it for my data. Similarly, I have found this post on a very similar question, yet it seems the samples the poster has are not related to each other the way mine are. My experience with tensorflow is limited to its get started page.

"dynamic: because not all features are present each day"

You've got the wrong idea of what "dynamic" means here. A dynamic RNN in Tensorflow means the RNN is unrolled dynamically during execution rather than being built for a fixed number of time steps, but the input at each step is always the same size (using 0 for a missing feature should work OK).

Anyway, what you've got here are sequences of varying length (day1 ... day?) of feature vectors (feature1 ... featureN). First, you need an LSTM cell

cell = tf.contrib.rnn.LSTMCell(size)  # size = number of hidden units

so you can then create a dynamically unrolled rnn graph using tf.nn.dynamic_rnn. From the docs:

inputs: The RNN inputs.

If time_major == False (default), this must be a Tensor of shape: [batch_size, max_time, ...], or a nested tuple of such elements.

where max_time refers to the input sequence length. Because we're using dynamic_rnn, the sequence length doesn't need to be fixed when the graph is built, so your input placeholder could be:

x = tf.placeholder(tf.float32, shape=(batch_size, None, N))

which is then fed into the rnn like this:

# dtype is required when no initial_state is passed
outputs, state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

This means your input data should have the shape (batch_size, seq_length, N). If the examples in one batch have varying lengths, you should pad them with 0-vectors to the max length and pass the true lengths to dynamic_rnn through the sequence_length parameter.
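As a rough illustration of that formatting step, here is a minimal pure-Python sketch that groups flat rows by person, zero-pads each person's sequence to the longest one, and records the true lengths. The helper names (group_by_person, pad_batch) and the toy rows with two features are made up for illustration; they are not part of Tensorflow or of the original data.

```python
from collections import defaultdict

def group_by_person(rows):
    """Group flat (person, day, f1..fN, label) rows into per-person
    feature sequences, ordered by day. (Hypothetical helper.)"""
    per_person = defaultdict(list)
    for person, day, *rest in rows:
        features = rest[:-1]  # the last column is the label
        per_person[person].append((day, features))
    sequences = {}
    for person, days in per_person.items():
        days.sort(key=lambda d: d[0])
        sequences[person] = [features for _, features in days]
    return sequences

def pad_batch(sequences, num_features):
    """Zero-pad every sequence to the longest one in the batch.
    Returns (batch, lengths), where batch has shape
    (batch_size, max_time, num_features) as nested lists."""
    lengths = [len(seq) for seq in sequences]
    max_time = max(lengths)
    batch = [
        list(seq) + [[0.0] * num_features for _ in range(max_time - len(seq))]
        for seq in sequences
    ]
    return batch, lengths

# Toy data (assumed): person, day, feature1, feature2, label
rows = [
    ("person1", 1, 0.5, 0.0, 1),
    ("person1", 2, 0.1, 0.3, 0),
    ("person2", 1, 0.2, 0.2, 0),
    ("person2", 2, 0.0, 0.9, 1),
    ("person2", 3, 0.4, 0.0, 1),
]

sequences = group_by_person(rows)
people = sorted(sequences)
batch, lengths = pad_batch([sequences[p] for p in people], num_features=2)
```

The nested list batch could then be converted to a float32 array and fed to the placeholder x, with lengths passed as the sequence_length argument of tf.nn.dynamic_rnn so the padded steps are ignored.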

Obviously I've skipped a lot of details, so to fully understand RNNs you should probably read one of the many excellent RNN tutorials, like this one for example.
