How to format time-series data with multiple categories and numerical data related to each category for input into a TensorFlow model?
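I am trying to format data for a regression problem in TensorFlow using an LSTM model. The data is a time series in which each timestamp carries categorical data, numerical data tied to those categories, and numerical data that is not tied to any category.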
<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px; overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px; font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
<tr> <th class="tg-0pky">Time</th> <th class="tg-0pky">Person</th> <th class="tg-0pky">Height</th> <th class="tg-0pky">Weight</th> <th class="tg-0pky">Location of Sun</th> <th class="tg-0pky">Location of Moon</th> </tr>
</thead>
<tbody>
<tr> <td class="tg-0pky">00:00</td> <td class="tg-0pky">A</td> <td class="tg-0pky">1</td> <td class="tg-0pky">2</td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> </tr>
<tr> <td class="tg-0pky"></td> <td class="tg-0pky">B</td> <td class="tg-0pky">3</td> <td class="tg-0pky">7</td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> </tr>
<tr> <td class="tg-0pky"></td> <td class="tg-0pky">C</td> <td class="tg-0pky">3</td> <td class="tg-0pky">5</td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> </tr>
<tr> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky">4</td> <td class="tg-0pky">5</td> </tr>
<tr> <td class="tg-0pky">00:05</td> <td class="tg-0pky">D</td> <td class="tg-0pky">1</td> <td class="tg-0pky">3</td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> </tr>
<tr> <td class="tg-0pky"></td> <td class="tg-0pky">A</td> <td class="tg-0pky">4</td> <td class="tg-0pky">5</td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> </tr>
<tr> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky">2</td> <td class="tg-0pky">3</td> </tr>
<tr> <td class="tg-0pky">00:10</td> <td class="tg-0pky">B</td> <td class="tg-0pky">2</td> <td class="tg-0pky">3</td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> </tr>
<tr> <td class="tg-0pky"></td> <td class="tg-0pky">C</td> <td class="tg-0pky">4</td> <td class="tg-0pky">5</td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> </tr>
<tr> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky"></td> <td class="tg-0pky">1</td> <td class="tg-0pky">3</td> </tr>
</tbody>
</table>
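A fixed number of categories can appear across the time series, but the number of categories present at any particular timestamp varies. I am considering using tf.keras.preprocessing.text.Tokenizer to tokenize the categorical data, but I am not sure how to format the data as input to a neural network while preserving its structure.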
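To make the question concrete, here is a minimal sketch of one layout that might fit the sample table. It is an assumption for illustration, not a confirmed or recommended approach: because the set of people is fixed, each timestamp is flattened into a fixed-width row with one slot per person (a presence flag, height, weight) followed by the sun and moon values. The names `PEOPLE`, `SLOT`, and `build_features` are hypothetical, and the presence flag is an assumed way to distinguish "absent" from "zero".

```python
# Sample rows copied from the table above: (time, person, height, weight).
person_rows = [
    ("00:00", "A", 1, 2),
    ("00:00", "B", 3, 7),
    ("00:00", "C", 3, 5),
    ("00:05", "D", 1, 3),
    ("00:05", "A", 4, 5),
    ("00:10", "B", 2, 3),
    ("00:10", "C", 4, 5),
]
# Per-timestamp values that are not tied to any person: (sun, moon).
global_rows = {"00:00": (4, 5), "00:05": (2, 3), "00:10": (1, 3)}

# Assumption: the full set of categories is known up front and fixed.
PEOPLE = ["A", "B", "C", "D"]
SLOT = 3  # per-person slot: [presence flag, height, weight]


def build_features(person_rows, global_rows, people=PEOPLE):
    """Return (timestamps, matrix) with one fixed-width row per timestamp."""
    index = {name: i for i, name in enumerate(people)}
    # Zero-padded HH:MM strings sort correctly as plain strings.
    timestamps = sorted(global_rows)
    width = len(people) * SLOT + 2
    matrix = {t: [0.0] * width for t in timestamps}
    for t, person, height, weight in person_rows:
        start = index[person] * SLOT
        # Absent people keep an all-zero slot; present ones get flag = 1.
        matrix[t][start:start + SLOT] = [1.0, float(height), float(weight)]
    for t, (sun, moon) in global_rows.items():
        matrix[t][-2:] = [float(sun), float(moon)]
    return timestamps, [matrix[t] for t in timestamps]


timestamps, features = build_features(person_rows, global_rows)
```

Here `features` holds 3 rows (one per timestamp) of 14 values each. Wrapping it in one more list would give the (batch, timesteps, features) layout that a Keras LSTM layer takes as input. Whether this flattening keeps enough of the structure is exactly the open question.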
I realize my question may not be 100% clear, so if you would like me to clarify anything, please comment below.