How to feed a 4D tensor into an LSTM model?

Asked 2021-12-10 04:25:35 · Tags: python, pytorch, lstm, recurrent-neural-network

I want to use an LSTM model to predict future sales.

The data looks like the table below.

date	store	family	sales
01/01/2013	1	AUTOMOTIVE	0
01/01/2013	1	BABY CARE	0
01/01/2013	1	BEAUTY	1
..	.	..	.
01/01/2013	2	AUTOMOTIVE	0
01/01/2013	2	BABY CARE	0
..	.	..	.
01/01/2013	50	AUTOMOTIVE	0
..	.	..	.
01/02/2013	1	AUTOMOTIVE	0
01/02/2013	1	BABY CARE	50
..	.	..	.
01/02/2013	2	AUTOMOTIVE	500
01/02/2013	2	BABY CARE	0
..	.	..	.
01/02/2013	50	AUTOMOTIVE	0
..	.	..	.
..	.	..	.
12/31/2015	1	AUTOMOTIVE	0
12/31/2015	1	BABY CARE	50
..	.	..	.
12/31/2015	2	AUTOMOTIVE	500
12/31/2015	2	BABY CARE	0
..	.	..	.
12/31/2015	50	AUTOMOTIVE	0
..	.	..	.

For each day, there are 50 stores.
Each store has several types of family (product). (They are all in perfect order, thank God.)
Finally, each type of family has its own sales figure.

Here is the problem.

The input of an LSTM model has shape (Batch_Size, Sequence_Length, Input_Dimension). It is a 3D tensor.

However, in my case, my Input_Dimension is 2D, which is (rows x columns):
rows: number of rows in one day, which is 1782
columns: number of features, which is 2 (store and family)

Is there a good way to get my data into a shape that can be fed into an LSTM model?

Thanks a lot!

2 Answers

The concept is straightforward. It works differently during training and prediction, however, so I'll explain the two cases separately.

For training, you simply chop the part of your sequence that you chose as training data into smaller subsequences of time points. (You need to do this because the network is "unrolled" for the duration of those subsequences, so they cannot be too long; if there is no obvious choice, 32 or 64 timesteps should be a good starting point.) To make the most of your data, you also want your subsequences to overlap somewhat. Both TF and PyTorch should have some helper functions for this (such as tf.keras.utils.timeseries_dataset_from_array). You may find this article useful: https://mobiarch.wordpress.com/2020/11/13/preparing-time-series-data-for-rnn-in-tensorflow/
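As a minimal sketch of the chopping step described above, assuming the series is already a NumPy array of shape (timesteps, features): the helper name make_windows, the window length of 32, and the stride of 8 are illustrative choices, not part of the answer. Pairing each window with the timestep that follows it as a target is one common setup, assumed here for illustration.

```python
import numpy as np

def make_windows(series, seq_len, stride):
    """Cut a (timesteps, features) array into overlapping windows.

    Returns inputs of shape (num_windows, seq_len, features) and, as an
    assumed next-step target, the timestep right after each window.
    """
    # The last valid start must leave one row after the window for the target.
    starts = range(0, len(series) - seq_len, stride)
    inputs = np.stack([series[s:s + seq_len] for s in starts])
    targets = np.stack([series[s + seq_len] for s in starts])
    return inputs, targets

# Toy data: 100 timesteps, 5 features (hypothetical sizes).
series = np.random.rand(100, 5).astype(np.float32)
inputs, targets = make_windows(series, seq_len=32, stride=8)
print(inputs.shape, targets.shape)
```

A stride smaller than seq_len gives the overlap mentioned above; a stride equal to seq_len would give non-overlapping windows.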
For prediction, you can just feed your sequence after reshaping it into a 3D tensor whose first dimension has size 1. Either reshape or newaxis in NumPy can do this. Of course, you don't need to feed your entire sequence, only enough past timesteps as context for the RNN's state to converge to something appropriate.
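The reshaping for prediction might look like the following sketch. The array sizes and the variable names (history, context) are hypothetical; only reshape and newaxis come from the answer.

```python
import numpy as np

seq_len, num_features = 32, 5                 # hypothetical sizes
history = np.random.rand(100, num_features).astype(np.float32)

# Keep only the most recent timesteps as context.
context = history[-seq_len:]                  # (seq_len, num_features)

# Two equivalent ways to add a leading batch dimension of size 1.
batch_a = context[np.newaxis, ...]            # (1, seq_len, num_features)
batch_b = context.reshape(1, seq_len, num_features)
print(batch_a.shape)
```

In PyTorch, torch.from_numpy(batch_a) would turn this into a tensor, and the (batch, sequence, feature) layout matches an nn.LSTM created with batch_first=True.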

The solution I came up with is to turn all the data for each day into one long sequence. The dimension then becomes 1D, so it can be fed into the LSTM model.
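One way to read this idea, sketched under assumptions: because the rows are sorted by date, then store, then family, and assuming every day has the same number of rows, the sales column can be reshaped so that each day becomes a single feature vector. The toy sizes below are hypothetical stand-ins for the 1782 rows per day in the question.

```python
import numpy as np

# Hypothetical toy sizes; the question has 1782 rows per day.
num_days, num_stores, num_families = 4, 3, 2
rows_per_day = num_stores * num_families

# Sales column in the file's order: date, then store, then family.
sales = np.arange(num_days * rows_per_day, dtype=np.float32)

# One row per day; each (store, family) pair becomes one feature.
daily = sales.reshape(num_days, rows_per_day)  # (num_days, rows_per_day)
print(daily.shape)
```

With this layout, Input_Dimension would be the number of rows per day, and Sequence_Length would be the number of consecutive days in each training window.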

But I don't think this is the optimal solution.

Has anyone come up with a better answer? I'd appreciate it.