When should reuse=True be set for multi-GPU training in TensorFlow?

Question

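I am trying to train a network with multiple towers in TensorFlow, and I set reuse=True for all towers. However, the cifar10 multi-GPU example in the TensorFlow tutorials only enables variable reuse after the first tower has been created: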

with tf.variable_scope(tf.get_variable_scope()):
  for i in xrange(FLAGS.num_gpus):
    with tf.device('/gpu:%d' % i):
      with tf.name_scope('%s_%d' % (cifar10.TOWER_NAME, i)) as scope:
        # Dequeues one batch for the GPU
        image_batch, label_batch = batch_queue.dequeue()
        # Calculate the loss for one tower of the CIFAR model. This function
        # constructs the entire CIFAR model but shares the variables across
        # all towers.
        # Actually the logits (whole network) is defined in tower_loss
        loss = tower_loss(scope, image_batch, label_batch)

        # Reuse variables for the next tower.
        tf.get_variable_scope().reuse_variables()

Does it make a difference? What happens if we set reuse=True beforehand?

Answer 1

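On the first run you need reuse=False so that the variables get created. If reuse=True is set but the variables have not been constructed yet, you get an error.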
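A minimal sketch of that failure mode, assuming the TF 1.x graph API imported through `tensorflow.compat.v1` (so it can also run on TF 2.x). The helper `build_weight` and the scope name `model` are hypothetical names used only for illustration.

```python
import tensorflow.compat.v1 as tf


def build_weight():
    # Hypothetical one-variable "model" used only for this illustration.
    return tf.get_variable("w", shape=[2, 2])


with tf.Graph().as_default():
    # Asking for reuse before anything exists: the lookup fails.
    try:
        with tf.variable_scope("model", reuse=True):
            build_weight()
        reuse_first_error = None
    except ValueError as err:
        reuse_first_error = err

    # Create first (reuse left at its default), then reuse.
    with tf.variable_scope("model"):
        w_created = build_weight()
    with tf.variable_scope("model", reuse=True):
        w_reused = build_weight()

    variable_names = [v.name for v in tf.global_variables()]
```

Here `reuse_first_error` should hold the `ValueError` raised by the first attempt, while `w_created` and `w_reused` refer to the same variable, `model/w`.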

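If you are using a recent version of TensorFlow (I think > 1.4), you can use reuse=tf.AUTO_REUSE, which does the magic for you: variables are created if they do not exist yet and reused otherwise.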
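As a rough sketch of the AUTO_REUSE behaviour, assuming the same `tensorflow.compat.v1` import as a stand-in for the TF 1.x API (the scope name `model` and variable name `w` are illustrative):

```python
import tensorflow.compat.v1 as tf

with tf.Graph().as_default():
    weights = []
    for _ in range(2):
        # The same scope is opened twice with identical arguments.
        # The first pass creates "model/w", the second pass reuses it.
        with tf.variable_scope("model", reuse=tf.AUTO_REUSE):
            weights.append(tf.get_variable("w", shape=[2, 2]))

    auto_reuse_names = [v.name for v in tf.global_variables()]
```

With this flag the loop body does not need to know whether it is building the first tower or a later one.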

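I am not sure how this interacts with your multi-device setup. Double-check that the variable names are not prefixed by the device. If they are, nothing is reused and each device ends up with its own separate variables.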
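One way to run that check is to build the towers and then list the variable names. The sketch below follows the tutorial's pattern under a few assumptions: `tensorflow.compat.v1` stands in for the TF 1.x API, `/cpu:0` replaces the GPU devices so it runs anywhere, and the tower body (a single variable `w` and an op named `out`) is hypothetical.

```python
import tensorflow.compat.v1 as tf

with tf.Graph().as_default():
    tower_ops = []
    with tf.variable_scope(tf.get_variable_scope()):
        for i in range(2):
            # "/cpu:0" is used here only so the sketch runs without GPUs.
            with tf.device("/cpu:0"):
                with tf.name_scope("tower_%d" % i):
                    w = tf.get_variable("w", shape=[2, 2])
                    tower_ops.append(tf.identity(w, name="out"))
                    # Reuse variables for the next tower.
                    tf.get_variable_scope().reuse_variables()

    checked_names = [v.name for v in tf.global_variables()]
    op_names = [op.name for op in tower_ops]
```

In this sketch `tf.name_scope` prefixes the ops in `op_names` with the tower name, but it does not affect names created by `tf.get_variable`, so `checked_names` should contain a single unprefixed entry.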

Answer 2

There are two ways to share variables.

Either version 1:

with tf.variable_scope("model"):
  output1 = my_image_filter(input1)
with tf.variable_scope("model", reuse=True):
  output2 = my_image_filter(input2)

Or version 2:

with tf.variable_scope("model") as scope:
  output1 = my_image_filter(input1)
  scope.reuse_variables()
  output2 = my_image_filter(input2)

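Both approaches share the variables. The cifar10 tutorial uses the second one because it is cleaner (that is just my opinion). You can try to rebuild it with version 1, but the code would probably be less readable.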
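For comparison, here is a rough sketch of what a version 1 style tower loop might look like. It is not the tutorial's code: `tower_output`, the scope name `model`, the input shapes, and the use of `/cpu:0` instead of GPU devices are all assumptions made so the example stays small and runnable, and `tensorflow.compat.v1` stands in for the TF 1.x API.

```python
import tensorflow.compat.v1 as tf


def tower_output(x):
    # Hypothetical tower: one shared weight matrix applied to the input.
    w = tf.get_variable("w", shape=[3, 1])
    return tf.matmul(x, w)


with tf.Graph().as_default():
    tower_outputs = []
    for i in range(2):
        # "/cpu:0" keeps the sketch runnable without GPUs.
        with tf.device("/cpu:0"), tf.name_scope("tower_%d" % i):
            # The first tower creates the variables, later towers reuse them.
            with tf.variable_scope("model", reuse=(True if i > 0 else None)):
                x = tf.ones([4, 3])
                tower_outputs.append(tower_output(x))

    shared_names = [v.name for v in tf.global_variables()]
```

The reuse flag has to be computed per iteration here, which is the extra bookkeeping that `reuse_variables()` avoids in version 2. Passing reuse=True on every iteration, including the first, would fail because no variables exist yet.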

Answer 1
2 2018-01-29 10:43:39

Answer 2
1 Accepted 2018-01-29 13:10:57
