
Training Multi-GPU on Tensorflow: a simpler way?

Original 2016-12-07 23:10:56 5 1 machine-learning/ tensorflow/ gpu

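I have been using the training method proposed in the cifar10_multi_gpu_train example for (local) multi-GPU training, i.e. creating several towers and then averaging the gradients. However, I was wondering about the following: what happens if I just take the losses coming from the different GPUs, sum them up, and then apply gradient descent to that new loss?

Would that work? This is probably a silly question, and there must be a limitation somewhere, so I would be happy if you could comment on it.

Thanks, G.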

1 Answer

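Summing will not work. You would get a larger loss, and consequently larger and possibly wrong gradients. When you average the gradients, you get the mean of the directions the weights should move in to minimize the loss, but each direction is computed for its exact loss value.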
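To make the "larger gradients" point concrete, here is a small sketch in plain Python rather than TensorFlow. It assumes a toy one-parameter model y = w * x and two hand-made data shards; the function names are made up for illustration and are not part of the cifar10 example. Since differentiation is linear, the gradient of the summed loss is the sum of the per-tower gradients, which is the number of towers times their average.

```python
# Toy illustration in plain Python (no TensorFlow): each "tower" holds one
# shard of the batch and computes the gradient of its own mean-squared-error
# loss for a one-parameter model y = w * x. All names here are hypothetical.

def tower_gradient(w, shard):
    """Gradient of mean((w*x - y)^2) over one shard with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def summed_loss_gradient(w, shards):
    """Gradient of loss_0 + loss_1 + ... (the approach asked about)."""
    return sum(tower_gradient(w, s) for s in shards)

def averaged_gradient(w, shards):
    """Average of the per-tower gradients (what the towers example does)."""
    return summed_loss_gradient(w, shards) / len(shards)

shards = [
    [(1.0, 2.0), (2.0, 4.0)],   # data seen by tower 0
    [(3.0, 6.0), (4.0, 8.0)],   # data seen by tower 1
]
w = 0.5
g_sum = summed_loss_gradient(w, shards)
g_avg = averaged_gradient(w, shards)

# The two gradients point the same way but differ by a factor len(shards).
print(g_sum, g_avg, g_sum / g_avg)
```

With the same learning rate, a step taken on the summed loss is therefore as many times larger as there are towers.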

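One thing you could try is to run the towers independently and then average the weights from time to time. Convergence is slower, but processing on each node is faster.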
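A rough sketch of that idea, again in plain Python with a toy model y = w * x, might look like the following. The function names, learning rate, step count and sync interval are all assumptions chosen for illustration; a real setup would average every variable of the network across devices instead of a single float.

```python
# Sketch of "train towers independently, average the weights now and then",
# in plain Python with a one-parameter model y = w * x. The function names,
# learning rate and sync interval are illustrative assumptions.

def tower_gradient(w, shard):
    """Gradient of mean((w*x - y)^2) over one shard with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def train_with_periodic_averaging(shards, steps=200, sync_every=10,
                                  lr=0.01, w0=0.0):
    weights = [w0 for _ in shards]  # one private copy of w per tower
    for step in range(1, steps + 1):
        # Each tower takes a local gradient step using only its own shard.
        weights = [w - lr * tower_gradient(w, s)
                   for w, s in zip(weights, shards)]
        # Every sync_every steps, replace all copies with their mean.
        if step % sync_every == 0:
            mean_w = sum(weights) / len(weights)
            weights = [mean_w for _ in weights]
    return weights

shards = [
    [(1.0, 2.0), (2.0, 4.0)],   # data seen by tower 0
    [(3.0, 6.0), (4.0, 8.0)],   # data seen by tower 1
]
final_weights = train_with_periodic_averaging(shards)
print(final_weights)
```

Between syncs the towers never communicate, which is where the per-node speedup would come from; the price is that the copies drift apart until the next averaging step.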

Related questions

TensorFlow: Is it possible to restore checkpoint models for multi-gpu training?

tensorflow distributed training hybrid with multi-GPU methodology

TensorFlow: Multi-GPU configuration (performance)

tensorflow c++ SetDefaultDevice in multi-gpu mode

Keras multi-gpu model fails for a custom model

Tensorflow error upon training model (on GPU)

Resume training with multi_gpu_model in Keras

Multi GPU training for Transformers with different GPUs

Multi GPU seems not work on TensorFlow1.0

Simplest way to distribute Tensorflow training on premise?


