
Keras data augmentation layers in model or out of model

Original post: 2022-08-24 09:08:25 | Tags: python, tensorflow, keras, data-augmentation

This may be a silly question, but how exactly do the preprocessing layers in Keras work, especially when they are part of the model itself? I'm comparing this to applying preprocessing outside the model and then feeding the results in for training.

I'm trying to understand how data augmentation runs in Keras models. Let's say I have 1000 images for training. Outside the model, I can apply augmentation 10x and get 10000 resulting images for training.

But I don't understand what happens when you use a preprocessing layer for augmentation. Does this layer (or these layers, if you use several) take each image and apply the transformations before training? Does this mean the total number of images used for training (and validation, I assume) is the number of epochs * the original number of images?

Is one option better than the other? Does that depend on the number of images one originally has before augmentation?

1 answer

The benefit of preprocessing layers is that the model is truly end-to-end, i.e. raw data comes in and a prediction comes out. It makes your model portable, since the preprocessing procedure is included in the SavedModel.
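As a rough illustration of the in-model approach, here is a minimal sketch. The layer choices, image size, and toy random data are assumptions made for the example, not something from the question. The random augmentation layers are only active when the model is called in training mode, so each epoch sees a freshly transformed variant of each original image, and the number of images per epoch stays the same.

```python
import numpy as np
import tensorflow as tf

# Augmentation layers (illustrative choices, not prescribed by the question).
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

# End-to-end model: raw pixel values in, logits out.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    augment,
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Hypothetical toy batch standing in for real images.
images = np.random.randint(0, 256, size=(4, 32, 32, 3)).astype("float32")

# In training mode the layers apply random transformations on every call.
train_out = augment(images, training=True)

# In inference mode the random layers pass the images through unchanged.
eval_out = augment(images, training=False)

preds = model(images, training=False).numpy()
```

During `model.fit` the layers run in training mode automatically, and during `model.predict` or `model.evaluate` they are skipped, so validation data is not augmented.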

However, with the layers inside the model, all of the preprocessing runs on the GPU along with the rest of the model. Usually it makes sense to load the data using CPU worker(s) in the background while the GPU optimizes the model.

Alternatively, you could use a preprocessing layer outside of the model, inside a Dataset. The benefit is that you can still easily create an inference-only model that includes the layers, which gives you portability at inference time while keeping the speedup during training.
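The Dataset approach could look roughly like the sketch below. The toy data, layer choices, and the names `core` and `export_model` are hypothetical and only serve the illustration. The augmentation is applied in `Dataset.map`, and a separate inference-only model wraps the deterministic `Rescaling` step around the trained network.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data standing in for a real image dataset.
images = np.random.randint(0, 256, size=(16, 32, 32, 3)).astype("float32")
labels = np.random.randint(0, 10, size=(16,))

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])
rescale = tf.keras.layers.Rescaling(1.0 / 255)

# Precaution: call once eagerly so the layers are built before tf.data
# traces the map function.
_ = rescale(augment(images[:1], training=True))

# Preprocessing runs inside the tf.data pipeline, in the background.
train_ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .batch(8)
    .map(lambda x, y: (rescale(augment(x, training=True)), y),
         num_parallel_calls=tf.data.AUTOTUNE)
    .prefetch(tf.data.AUTOTUNE)
)

# The trainable model expects already-preprocessed inputs.
core = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])
core.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
core.fit(train_ds, epochs=1, verbose=0)

# Inference-only model: raw pixels in, logits out, no random augmentation.
raw_inputs = tf.keras.Input(shape=(32, 32, 3))
export_model = tf.keras.Model(raw_inputs, core(rescale(raw_inputs)))
logits = export_model(images[:2], training=False).numpy()
```

Only `export_model` would need to be saved for deployment. It shares its weights with `core`, so it reflects whatever training was done through the Dataset pipeline.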

For more information, see the Keras guide.