What is a "batch" in Keras model.fit() when the training data are images?

Question

Say I have a set of image data for training: 20 input images and 20 output images, each of size 512*512. First I prepare the training data as "train_image_input" (size 20*512*512) and "train_image_output" (size 20*512*512), then I run the code below in Keras:

model.fit(train_image_input, train_image_output,epochs=3,batch_size=5)

I would like to confirm the definition of a "batch" when the data are images. In the example above, does "batch_size=5" mean that

5 images (data size 5*512*512) are taken into training at a time?
5 columns of a single image (data size 5*512) are taken into training at a time?

I have read this article: https://machinelearningmastery.com/difference-between-a-batch-and-an-epoch/ and the description below confuses me about the definition of a sample/batch when the data are images:

What Is a Sample?
A sample is a single row of data. It contains inputs that are fed into the algorithm and an output that is used to compare to the prediction and calculate an error. A training dataset is comprised of many rows of data, e.g. many samples. A sample may also be called an instance, an observation, an input vector, or a feature vector. Now that we know what a sample is, let's define a batch.

What Is a Batch?
The batch size is a hyperparameter that defines the number of samples to work through before updating the internal model parameters.

Furthermore, if I set "batch_size=30", which is larger than the number of images, the code runs without error. Does that mean the second option (data size 5*512) is the correct one?

Thanks.

Answer 1

The batch size defines the number of samples that will be propagated through the network.
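To relate this to the image case in the question: as far as I know, when fit() is given NumPy arrays, Keras treats the first axis as the sample axis, so each of the 20 images counts as one sample. The sketch below only mimics that slicing with plain NumPy. iter_batches is a hypothetical helper written for illustration, not a Keras function, and it ignores the shuffling that fit() applies by default.

```python
import numpy as np

# Stand-in for the asker's array: 20 images of 512x512 pixels.
train_image_input = np.zeros((20, 512, 512), dtype=np.float32)


def iter_batches(data, batch_size):
    """Yield consecutive slices along axis 0 (hypothetical helper, not a Keras API)."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


# batch_size=5: whole images are grouped, never columns of one image.
shapes_5 = [batch.shape for batch in iter_batches(train_image_input, 5)]

# batch_size=30 exceeds the 20 available samples, so one batch holds all of them.
shapes_30 = [batch.shape for batch in iter_batches(train_image_input, 30)]

print(shapes_5)
print(shapes_30)
```

Under that assumption, batch_size=5 corresponds to the first option in the question (5*512*512 per step), and batch_size=30 raises no error simply because slicing past the end of the array returns the 20 images that exist.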

For instance, let's say you have 1050 training samples and you want to set batch_size to 100. The algorithm takes the first 100 samples (1st to 100th) from the training dataset and trains the network. Next, it takes the second 100 samples (101st to 200th) and trains the network again. We can keep repeating this procedure until all samples have been propagated through the network. A problem can arise with the last set of samples: in our example, 1050 is not evenly divisible by 100. The simplest solution is just to take the final 50 samples and train the network on them.
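The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the counting, using the numbers from the paragraph above and assuming the short final batch is kept rather than dropped; the variable names are mine.

```python
import math

num_samples = 1050
batch_size = 100

# Number of weight updates per epoch when the short final batch is kept.
steps_per_epoch = math.ceil(num_samples / batch_size)

# Size of each batch in one pass over the data.
batch_sizes = [min(batch_size, num_samples - start)
               for start in range(0, num_samples, batch_size)]

print(steps_per_epoch)   # 11
print(batch_sizes[-1])   # 50
```

So one epoch consists of ten full batches of 100 samples followed by one batch of 50.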

Solution 1
0 Accepted 2020-04-20 04:00:31