

What does training = False actually do for Tensorflow Transfer Learning?

I have this code right here:

import tensorflow as tf

# input_shape and data_augmentation are defined elsewhere in my script
base_model = tf.keras.applications.resnet_v2.ResNet50V2(input_shape=input_shape, include_top=False, weights='imagenet')
base_model.trainable = False  # freeze the pretrained weights

inputs = tf.keras.Input(shape=input_shape)
x = data_augmentation(inputs)
x = tf.keras.applications.resnet_v2.preprocess_input(x)
x = base_model(x, training=False)

What does training = False actually do when we use it for base_model? I know that training is a boolean that specifies whether we want to run the layer in training mode or in inference mode, but following the Transfer Learning guide on Tensorflow, I can't figure out what it actually does.

We set base_model.trainable = False, which means the layers won't learn and we are just going to use what they learned from ImageNet. But what does base_model(x, training = False) do? I know this means the base model won't run in training mode, so when I call the fit() method, what is happening to base_model since training is set to False?

I've read that it has something to do with fine-tuning and batch norm layers, but I am a bit lost.

Also, should I use fine-tuning? If I am not planning to use it because the model is performing well anyway, should I set training = True? Or not set that value at all?

In general, that depends on your layers. For example, the Dropout layer only sets values to 0 when training=True. Another example is the BatchNormalization layer, which works differently during training and inference. For other layers, like the classic Dense layer, it does not make a difference. If you really want to know all the details, you will have to read about all the layers you use and their specific behavior.
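To make this concrete, here is a minimal sketch (assuming TensorFlow 2.x; the toy input x and the layer instances are just for illustration, not part of your model) showing how the training flag changes what Dropout and BatchNormalization compute, independently of whether the layer is trainable:

import numpy as np
import tensorflow as tf

x = np.ones((1, 4), dtype="float32")

# Dropout only zeroes activations when training=True.
dropout = tf.keras.layers.Dropout(0.5)
print(dropout(x, training=False).numpy())  # input passes through unchanged
print(dropout(x, training=True).numpy())   # ~half the values set to 0, rest scaled up

# BatchNormalization normalizes with the current batch's statistics when
# training=True (and updates its moving averages); with training=False it
# uses the stored moving mean/variance instead.
bn = tf.keras.layers.BatchNormalization()
print(bn(x, training=True).numpy())   # normalized with this batch's mean/variance
print(bn(x, training=False).numpy())  # normalized with the moving statistics

In your transfer-learning code, passing training=False when calling base_model keeps its BatchNormalization layers in inference mode, so they keep using the moving statistics learned on ImageNet instead of the statistics of your small batches.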
