
What's the difference between the attributes 'trainable' and 'training' in the BatchNormalization layer in Keras TensorFlow?

Original post 2020-07-04 13:34:44 5 1 python/ tensorflow/ keras/ tf.keras/ batch-normalization

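According to the official TensorFlow documentation:

About setting layer.trainable = False on a BatchNormalization layer:
The meaning of setting layer.trainable = False is to freeze the layer, i.e. its internal state will not change during training: its trainable weights will not be updated during fit() or train_on_batch(), and its state updates will not be run.
Usually, this does not necessarily mean that the layer is run in inference mode (which is normally controlled by the training argument that can be passed when calling a layer). "Frozen state" and "inference mode" are two separate concepts.
However, in the case of the BatchNormalization layer, setting trainable = False on the layer means that the layer will be subsequently run in inference mode (meaning that it will use the moving mean and the moving variance to normalize the current batch, rather than using the mean and variance of the current batch).
This behavior has been introduced in TensorFlow 2.0, in order to enable layer.trainable = False to produce the most commonly expected behavior in the convnet fine-tuning use case.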
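
The BatchNormalization special case described in the quoted documentation can be checked with a minimal sketch. It assumes TensorFlow 2.x with eager execution; the input data and the names `bn`, `x` and `state_unchanged` are made up for illustration. The layer is frozen and then called with training=True anyway:

```python
import numpy as np
import tensorflow as tf

# Illustrative data whose per-feature mean is far from 0.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(64, 3)).astype("float32")

bn = tf.keras.layers.BatchNormalization()
bn(x, training=False)   # build the layer; an inference call does not touch its state
bn.trainable = False    # freeze the layer

weights_before = bn.get_weights()
# training=True is requested, but the frozen layer should run in inference mode.
out = np.asarray(bn(x, training=True))
weights_after = bn.get_weights()

state_unchanged = all(
    np.array_equal(a, b) for a, b in zip(weights_before, weights_after)
)
print("state unchanged:", state_unchanged)
# With batch statistics the output mean would be close to 0; with the initial
# moving statistics (mean 0, variance 1) it stays close to the input mean.
print("output mean per feature:", out.mean(axis=0))
```

If the frozen layer had used the statistics of the current batch, the output mean would be near zero and the moving statistics would have been updated.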

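I don't quite understand the terms "frozen state" and "inference mode" in this explanation. I tried fine-tuning with trainable set to False, and I found that the moving mean and moving variance were not updated.

So I have the following questions:

What's the difference between the 2 attributes training and trainable?
Are gamma and beta updated during training if trainable is set to False?
Why is it necessary to set trainable to False when fine-tuning?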

1 Answer

What's the difference between 2 attributes training and trainable?

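trainable: an attribute of the layer. If it is True, the layer's "trainable" weights (its parameters) will be updated during backpropagation.

training: some layers behave differently during the training and inference (or testing) steps. Examples include the Dropout layer and the BatchNormalization layer. This argument, passed when the layer is called, tells the layer which of the two ways it should behave.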
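
To make the distinction concrete, here is a minimal sketch, assuming TensorFlow 2.x with eager execution and made-up input data. It passes `training` as a call argument and sets `trainable` as a layer attribute; the variable names are illustrative only.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(64, 3)).astype("float32")

bn = tf.keras.layers.BatchNormalization()

# `training` is passed per call and selects the layer's behaviour.
out_train = np.asarray(bn(x, training=True))   # normalizes with the batch mean/variance
out_infer = np.asarray(bn(x, training=False))  # normalizes with the moving mean/variance

print("mean with training=True: ", out_train.mean())
print("mean with training=False:", out_infer.mean())

# `trainable` is set on the layer and decides which weights an optimizer may update.
n_trainable_default = len(bn.trainable_weights)  # gamma and beta
bn.trainable = False
n_trainable_frozen = len(bn.trainable_weights)   # nothing left to update

print(n_trainable_default, n_trainable_frozen)
```

The two outputs differ because the moving statistics have barely moved away from their initial values after a single training call, while the batch statistics match the data exactly.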

Is gamma and beta getting updated in the training process if set trainable to false?

Since gamma and beta are the "trainable" parameters of the BN layer, they will not be updated during training if trainable is set to False.
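
A small experiment along these lines can confirm this. The sketch below assumes TensorFlow 2.x; the toy model, the random data and the learning rate are invented for illustration. It freezes only the BN layer, trains briefly, and compares the weights before and after:

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(256, 4)).astype("float32")
y = rng.normal(size=(256, 1)).astype("float32")

bn = tf.keras.layers.BatchNormalization()
dense = tf.keras.layers.Dense(1)
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), bn, dense])

bn.trainable = False  # freeze the BN layer before compiling
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), loss="mse")

bn_before = bn.get_weights()        # gamma, beta and the moving statistics
dense_before = dense.get_weights()
model.fit(x, y, epochs=2, batch_size=32, verbose=0)
bn_after = bn.get_weights()
dense_after = dense.get_weights()

# The frozen BN layer keeps all of its weights; the Dense layer is still trained.
bn_unchanged = all(np.array_equal(a, b) for a, b in zip(bn_before, bn_after))
dense_changed = any(
    not np.array_equal(a, b) for a, b in zip(dense_before, dense_after)
)
print("BN weights unchanged:", bn_unchanged)
print("Dense weights changed:", dense_changed)
```

This also matches the observation in the question that the moving mean and moving variance stop updating once the layer is frozen.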

Why is it necessary to set trainable to false when fine-tuning?

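When fine-tuning, we first add our own classification FC layer on top. That layer is randomly initialized, whereas our "pretrained" model is already (somewhat) calibrated for the task.

As an analogy, think of it this way.

You have a number line from 0 to 10. On this number line, "0" represents a completely random model and "10" represents a perfect model. Our pretrained model sits at around 5, 6 or 7, i.e. it is most likely better than a random model. The FC layer we add on top sits at "0", because it is random at the start.

We set trainable = False for the pretrained model so that we can bring the FC layer up to the level of the pretrained model quickly, i.e. with a higher learning rate. If we don't set trainable = False for the pretrained model and still use a higher learning rate, it will wreak havoc on the pretrained weights.

So, initially, we use a higher learning rate with trainable = False for the pretrained model and train only the FC layer. After that, we unfreeze the pretrained model and use a very low learning rate to fine-tune it for our purpose.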
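
The two-phase procedure can be sketched as follows. This is only an illustration under several assumptions: `base` is a tiny stand-in for a real pretrained model (no weights are actually loaded), the data is random, and the learning rates 1e-3 and 1e-5 are arbitrary examples of "higher" and "very low". Calling the base with training=False so that its BN layers stay in inference mode after unfreezing follows the practice recommended in the Keras transfer-learning guide; the answer itself does not mention it.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x = rng.normal(size=(128, 8)).astype("float32")
y = rng.integers(0, 2, size=(128, 1)).astype("float32")

# Hypothetical stand-in for a pretrained backbone.
base = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.BatchNormalization(),
    ],
    name="base",
)

inputs = tf.keras.Input(shape=(8,))
features = base(inputs, training=False)  # BN keeps using its moving statistics
outputs = tf.keras.layers.Dense(1, activation="sigmoid", name="head")(features)
model = tf.keras.Model(inputs, outputs)

# Phase 1: freeze the base and train only the new head with a higher learning rate.
base.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="binary_crossentropy")
n_phase1 = len(model.trainable_weights)
base_before = base.get_weights()
model.fit(x, y, epochs=1, batch_size=32, verbose=0)
base_frozen_ok = all(
    np.array_equal(a, b) for a, b in zip(base_before, base.get_weights())
)

# Phase 2: unfreeze the base, recompile, and fine-tune with a very low learning rate.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss="binary_crossentropy")
n_phase2 = len(model.trainable_weights)
model.fit(x, y, epochs=1, batch_size=32, verbose=0)

print("trainable weights in phase 1:", n_phase1)
print("trainable weights in phase 2:", n_phase2)
print("base untouched during phase 1:", base_frozen_ok)
```

Recompiling after changing trainable ensures that the change is taken into account by the training step.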

Feel free to ask for more clarification if needed, and please upvote if you find this helpful.


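Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you need to repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.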
