TensorFlow matmul computation is slower on the GPU than on the CPU

Question

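This is my first attempt at GPU computing, and I was of course hoping for a large speedup. However, a basic example in TensorFlow actually runs slower:

On cpu:0, each of the ten runs takes 2 seconds on average; gpu:0 takes 2.7 seconds; and gpu:1 takes 3 seconds, which is 50% worse than cpu:0.

Here is the code: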

import tensorflow as tf
import numpy as np
import time
import random

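# Uses the TensorFlow 1.x graph/session API.
for _ in range(10):
    with tf.Session() as sess:
        start = time.time()  # timer starts before the graph is built
        with tf.device('/gpu:0'): # swap for 'cpu:0' or whatever
            # Each matrix is filled with 10^6 floats generated in Python on the host.
            a = tf.constant([random.random() for _ in range(1000 * 1000)], shape=[1000, 1000], name='a')
            b = tf.constant([random.random() for _ in range(1000 * 1000)], shape=[1000, 1000], name='b')
            # Chain of four matrix multiplications.
            c = tf.matmul(a, b)
            d = tf.matmul(a, c)
            e = tf.matmul(a, d)
            f = tf.matmul(a, e)
            # Evaluate the final product 1000 times.
            for _ in range(1000):
                sess.run(f)
        end = time.time()
        print(end - start)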

What am I observing here? Is the run time dominated by copying data between RAM and the GPU?

Answer 1

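The way you generate the data runs on the CPU (random.random() is a regular Python function, not a TF one). In addition, calling it 10^6 times is slower than requesting 10^6 random numbers in a single call. Change the code to: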

a = tf.random_uniform([1000, 1000], name='a')
b = tf.random_uniform([1000, 1000], name='b')

This way the data is generated in parallel on the GPU, and no time is wasted transferring it from RAM to the GPU.
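
To see how much of the measured time is plain Python data generation rather than matmul work, the list comprehension from the question can be timed on its own. The following is a minimal sketch using only the standard library; the helper name time_python_generation is hypothetical (it does not appear in the question), and the number it reports will vary by machine.

```python
import random
import time


def time_python_generation(n=1000 * 1000):
    """Time building n random floats with random.random(), as the question does.

    Hypothetical helper for illustration only. It involves no TensorFlow,
    so whatever it reports is host-side Python overhead.
    """
    start = time.perf_counter()
    values = [random.random() for _ in range(n)]
    elapsed = time.perf_counter() - start
    return values, elapsed


if __name__ == "__main__":
    # The question builds two such lists (for a and b) inside the timed region.
    values, elapsed = time_python_generation()
    print("generated %d floats in %.3f s" % (len(values), elapsed))
```

This cost is paid on the host whatever device is named in tf.device, and in the question's script it falls inside the timed region because the timer starts before the constants are built.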

Answer 1: accepted, score 4, 2016-11-22 10:08:07
