
Keras fit_generator() - How does batch for time series work?

Context:

I am currently working on time series prediction using Keras with the TensorFlow backend and, therefore, studied the tutorial provided here.

Following this tutorial, I came to the point where the generator for the fit_generator() method is described. The output this generator produces is as follows (sample on the left, target on the right):

[[[10. 15.]
  [20. 25.]]] => [[30. 35.]]     -> Batch no. 1: 2 Samples | 1 Target
  ---------------------------------------------
[[[20. 25.]
  [30. 35.]]] => [[40. 45.]]     -> Batch no. 2: 2 Samples | 1 Target
  ---------------------------------------------
[[[30. 35.]
  [40. 45.]]] => [[50. 55.]]     -> Batch no. 3: 2 Samples | 1 Target
  ---------------------------------------------
[[[40. 45.]
  [50. 55.]]] => [[60. 65.]]     -> Batch no. 4: 2 Samples | 1 Target
  ---------------------------------------------
[[[50. 55.]
  [60. 65.]]] => [[70. 75.]]     -> Batch no. 5: 2 Samples | 1 Target
  ---------------------------------------------
[[[60. 65.]
  [70. 75.]]] => [[80. 85.]]     -> Batch no. 6: 2 Samples | 1 Target
  ---------------------------------------------
[[[70. 75.]
  [80. 85.]]] => [[90. 95.]]     -> Batch no. 7: 2 Samples | 1 Target
  ---------------------------------------------
[[[80. 85.]
  [90. 95.]]] => [[100. 105.]]   -> Batch no. 8: 2 Samples | 1 Target

In the tutorial the TimeseriesGenerator was used, but for my question it is secondary whether a custom generator or this class is used. Regarding the data, we have 8 steps_per_epoch and a sample shape of (8, 1, 2, 2). The generator is fed to a recurrent neural network, implemented as an LSTM.

My questions

fit_generator() only allows a single target per batch, as output by the TimeseriesGenerator. When I first read about the batch option for fit(), I thought that I could have multiple samples and a corresponding number of targets (which are processed batchwise, meaning row by row). But this is not allowed by fit_generator() and is, therefore, obviously false. This would look, for example, like:

[[[10. 15. 20. 25.]]] => [[30. 35.]]     
[[[20. 25. 30. 35.]]] => [[40. 45.]]    
    |-> Batch no. 1: 2 Samples | 2 Targets
  ---------------------------------------------
[[[30. 35. 40. 45.]]] => [[50. 55.]]    
[[[40. 45. 50. 55.]]] => [[60. 65.]]    
    |-> Batch no. 2: 2 Samples | 2 Targets
  ---------------------------------------------
...

Secondly, I thought that, for example, [10, 15] and [20, 25] were used as input for the RNN consecutively for the target [30, 35], meaning that this would be analogous to inputting [10, 15, 20, 25]. Since the output from the RNN differs using the second approach (I tested it), this also has to be a wrong conclusion.

Hence, my questions are:

  1. Why is only one target per batch allowed (I know there are some workarounds, but there has to be a reason)?
  2. How may I understand the calculation of one batch? Meaning, how is some input like [[[40, 45], [50, 55]]] => [[60, 65]] processed, and why is it not analogous to [[[40, 45, 50, 55]]] => [[60, 65]]?



Edit according to today's answer
Since there is some misunderstanding about my definition of samples and targets, I follow what I understand Keras is trying to tell me when saying:

ValueError: Input arrays should have the same number of samples as target arrays. Found 1 input samples and 2 target samples.

This error occurs when I create, for example, a batch which looks like:

#This is just a single batch - Multiple batches would be fed to fit_generator()
(array([[[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]]]), 
                           array([[ 5,  6,  7,  8,  9],
                           [10, 11, 12, 13, 14]]))

This is supposed to be a single batch containing two time sequences of length 5 (5 consecutive data points / timesteps), whose targets are also two corresponding sequences: [5, 6, 7, 8, 9] is the target of [0, 1, 2, 3, 4], and [10, 11, 12, 13, 14] is the corresponding target of [5, 6, 7, 8, 9].
The sample shape in this would be shape(number_of_batches, number_of_elements_per_batch, sequence_size) and the target shape shape(number_of_elements_per_batch, sequence_size).
Keras sees 2 target samples (in the ValueError) because I provide 3D samples as input and 2D targets as output (maybe I just don't get how to provide 3D targets...).

Anyhow, according to @today's answer/comments, this is interpreted as two timesteps and five features by Keras. Regarding my first question (where I still see a sequence as the target to my sequence, as in this edit example), I seek information on how/whether I can achieve this and what such a batch would look like (as I tried to visualize in the question).

Short answers:

Why is only one target per batch allowed (I know there are some workarounds, but there has to be a reason)?

That's not the case at all. There is no restriction on the number of target samples in a batch. The only requirement is that you should have the same number of input and target samples in each batch. Read the long answer for further clarification.
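For illustration, a batch with two input samples and two corresponding targets is perfectly valid. A hypothetical batch built with numpy, reusing the data from the question:

```python
import numpy as np

# a single valid batch: two input samples and two corresponding targets
x_batch = np.array([[[10., 15.], [20., 25.]],
                    [[20., 25.], [30., 35.]]])  # shape (2, 2, 2): 2 samples, 2 timesteps, 2 features
y_batch = np.array([[30., 35.],
                    [40., 45.]])                # shape (2, 2): 2 target samples, 2 features

# the only requirement: same number of input and target samples
assert x_batch.shape[0] == y_batch.shape[0]
```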

How may I understand the calculation of one batch? Meaning, how is some input like [[[40, 45], [50, 55]]] => [[60, 65]] processed, and why is it not analogous to [[[40, 45, 50, 55]]] => [[60, 65]]?

The first one is a multi-variate timeseries (i.e. each timestep has more than one feature), and the second one is a uni-variate timeseries (i.e. each timestep has one feature). So they are not equivalent. Read the long answer for further clarification.
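The difference is visible directly in the array shapes. A quick numpy check of the two inputs from the question (plus the explicit uni-variate reading of the second one, which I add here for illustration):

```python
import numpy as np

multi = np.array([[[40., 45.], [50., 55.]]])  # shape (1, 2, 2): 2 timesteps, 2 features each
flat = np.array([[[40., 45., 50., 55.]]])     # shape (1, 1, 4): 1 timestep, 4 features
uni = flat.reshape(1, 4, 1)                   # shape (1, 4, 1): 4 timesteps, 1 feature each

# three different shapes -> three different inputs as far as an RNN is concerned
assert multi.shape == (1, 2, 2)
assert flat.shape == (1, 1, 4)
assert uni.shape == (1, 4, 1)
```

An LSTM unrolls over the second axis (timesteps), so each of these arrays is processed differently even though they contain the same numbers.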

Long answer:

I'll give the answer I mentioned in the comments section and try to elaborate on it using examples:

I think you are mixing up samples, timesteps, features and targets. Let me describe how I understand it: in the first example you provided, it seems that each input sample consists of 2 timesteps, e.g. [10, 15] and [20, 25], where each timestep consists of two features, e.g. 10 and 15, or 20 and 25. Further, the corresponding target consists of one timestep, e.g. [30, 35], which also has two features. In other words, each input sample in a batch must have a corresponding target. However, the shape of each input sample and its corresponding target may not necessarily be the same.

For example, consider a model where both its input and output are timeseries. If we denote the shape of each input sample as (input_num_timesteps, input_num_features) and the shape of each target (i.e. output) array as (output_num_timesteps, output_num_features), we would have the following cases:

1) The number of input and output timesteps is the same (i.e. input_num_timesteps == output_num_timesteps). Just as an example, the following model could achieve this:

from keras import layers
from keras import models

inp = layers.Input(shape=(input_num_timesteps, input_num_features))

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(inp)
# ...
x = layers.LSTM(..., return_sequences=True)(x)

# a final RNN layer that has `output_num_features` units
out = layers.LSTM(output_num_features, return_sequences=True)(x)

model = models.Model(inp, out)

2) The number of input and output timesteps is different (i.e. input_num_timesteps != output_num_timesteps). This is usually achieved by first encoding the input timeseries into a vector using a stack of one or more LSTM layers, and then repeating that vector output_num_timesteps times to get a timeseries of the desired length. For the repeat operation, we can easily use the RepeatVector layer in Keras. Again, just as an example, the following model could achieve this:

from keras import layers
from keras import models

inp = layers.Input(shape=(input_num_timesteps, input_num_features))

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(inp)
# ...
x = layers.LSTM(...)(x)  # The last layer ONLY returns the last output of RNN (i.e. return_sequences=False)

# repeat `x` as needed (i.e. as the number of timesteps in output timseries)
x = layers.RepeatVector(output_num_timesteps)(x)

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(x)
# ...
out = layers.LSTM(output_num_features, return_sequences=True)(x)

model = models.Model(inp, out)
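What RepeatVector does can be sketched in plain numpy: it takes the encoded vector of shape (batch, features) and tiles it along a new time axis. A minimal illustration (the values and sizes here are arbitrary):

```python
import numpy as np

encoded = np.array([[0.1, 0.2, 0.3]])  # shape (1, 3): one encoded vector per sample
output_num_timesteps = 4

# numpy equivalent of RepeatVector(output_num_timesteps): (1, 3) -> (1, 4, 3)
repeated = np.repeat(encoded[:, np.newaxis, :], output_num_timesteps, axis=1)

assert repeated.shape == (1, 4, 3)
assert (repeated[0, 0] == repeated[0, 3]).all()  # every timestep holds the same vector
```

The LSTM layers stacked on top of the repeated vector then transform these identical timesteps into the actual output sequence.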

As a special case, if the number of output timesteps is 1 (e.g. the network is trying to predict the next timestep given the last t timesteps), we may not need to use repeat, and instead we can just use a Dense layer (in this case the output shape of the model would be (None, output_num_features), and not (None, 1, output_num_features)):

inp = layers.Input(shape=(input_num_timesteps, input_num_features))

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(inp)
# ...
x = layers.LSTM(...)(x)  # The last layer ONLY returns the last output of RNN (i.e. return_sequences=False)

out = layers.Dense(output_num_features, activation=...)(x)

model = models.Model(inp, out)

Note that the architectures provided above are just for illustration, and you may need to tune or adapt them, e.g. by adding more layers such as a Dense layer, based on your use case and the problem you are trying to solve.


Update: The problem is that you don't pay enough attention when reading, both my comments and answer as well as the error raised by Keras. The error clearly states that:

... Found 1 input samples and 2 target samples.

So, after reading this carefully, if I were you I would say to myself: "OK, Keras thinks that the input batch has 1 input sample, but I think I am providing two samples!! Since I am a very good person(!), I think it's much more likely that I, rather than Keras, am wrong, so let's find out what I am doing wrong!" A simple and quick check would be to just examine the shape of the input array:

>>> np.array([[[0, 1, 2, 3, 4],
               [5, 6, 7, 8, 9]]]).shape
(1, 2, 5)

"Oh, it says (1,2,5) ! So that means one sample which has two timesteps and each timestep has five features!!! So I was wrong into thinking that this array consists of two samples of length 5 where each timestep is of length 1!! So what should I do now???" “哦,它说(1,2,5) !所以这意味着一个样本有两个时间步,每个时间步有五个特征!所以我认为这个数组由两个长度为5的样本组成,每个时间步长是长度1 !!所以我现在该怎么办?“ Well, you can fix it, step-by-step: 好吧,你可以一步一步地解决它:

# step 1: I want a numpy array
s1 = np.array([])

# step 2: I want it to have two samples
s2 = np.array([
               [],
               []
              ])

# step 3: I want each sample to have 5 timesteps of length 1 in them
s3 = np.array([
               [
                [0], [1], [2], [3], [4]
               ],
               [
                [5], [6], [7], [8], [9]
               ]
              ])

>>> s3.shape
(2, 5, 1)
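Equivalently, the same array can be built in one step from the flat version by adding the missing feature axis with reshape (or `np.newaxis`); a quick shortcut for the step-by-step construction above:

```python
import numpy as np

flat = np.array([[0, 1, 2, 3, 4],
                 [5, 6, 7, 8, 9]])   # shape (2, 5)

# add the trailing feature axis: (2, 5) -> (2, 5, 1)
s3 = flat.reshape(2, 5, 1)           # or: flat[..., np.newaxis]

assert s3.shape == (2, 5, 1)
assert (s3[:, :, 0] == flat).all()   # same values, one extra axis
```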

Voila! We did it! This was the input array; now check the target array. It must have two target samples of length 5, each with one feature, i.e. having a shape of (2, 5, 1):

>>> np.array([[ 5,  6,  7,  8,  9],
              [10, 11, 12, 13, 14]]).shape
(2, 5)

Almost! The last dimension (i.e. 1) is missing (NOTE: depending on the architecture of your model you may or may not need that last axis). So we can use the step-by-step approach above to find our mistake, or alternatively we can be a bit clever and just add an axis at the end:

>>> t = np.array([[ 5,  6,  7,  8,  9],
                  [10, 11, 12, 13, 14]])
>>> t = np.expand_dims(t, axis=-1)
>>> t.shape
(2, 5, 1)

Sorry, I can't explain it better than this! But in any case, when you see that something (i.e. the shape of the input/target arrays) is repeated over and over in my comments and my answer, assume that it must be something important and should be checked.
