
TensorFlow vs. Theano Performance

In the context of some neural network research, I'm evaluating several implementation approaches and libraries. Currently I'm comparing TensorFlow and Theano, and I'm struggling to get TensorFlow to perform well. Here is my simple hello-gradient benchmark: it just optimizes a scalar multiplication with one coefficient.

import time

class Timer:

   def __init__(self, what):
      self.what = what

   def __enter__(self):
      self.t1 = time.time()
      return self

   def __exit__(self,t,v,tb):
      t2 = time.time()
      print("{0} runs {1:.4f} seconds".format(self.what, t2-self.t1))


def run_tensorflow():

   import tensorflow as tf

   x = tf.placeholder(tf.float32)
   y = tf.placeholder(tf.float32)
   a = tf.Variable([1.], dtype=tf.float32)  # note: the second positional arg of tf.Variable is `trainable`, not dtype

   sess = tf.Session()
   sess.run(tf.global_variables_initializer())

   loss = (y-a*x)**2
   step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

   def one_step():
      sess.run(step, {x:1.,y:0.})

   with Timer('tensorflow') as t:
      result = [ one_step() for n in range(1000) ]


def run_theano():

   import theano as th

   x = th.tensor.dscalar()
   y = th.tensor.dscalar()
   a = th.tensor.dscalar()
   l = a*x

   loss = (y-l)**2
   dloss = th.tensor.grad(loss, a)
   dloss_f = th.function([x,y,a], dloss)

   a = [1.]

   def one_step():
      a[0] -= 0.01 * dloss_f(1.,0.,a[0])

   with Timer('theano') as t:
      result = [ one_step() for n in range(1000) ]


run_tensorflow()
run_theano()

I'm running this program on the CPU with the packages installed via pip. Running times are 0.36 s and 0.043 s for TensorFlow and Theano, respectively. I see similar performance differences for real networks, where the matrix-multiplication cost should dominate; still, TensorFlow is significantly slower.

I want to know whether I'm using TensorFlow incorrectly for what I'm trying to do. Should I not be calling the run() method in a loop?

  1. TF and Theano are designed for handling large objects, on the order of 1M elements. Benchmarking their handling of scalars is not particularly relevant.

  2. This is an apples-to-oranges comparison: with TF, you are timing both the compilation and the run time, while with Theano you are timing only the run time! This is because when you call theano.function, all of the compilation happens at that point. On the other hand, in TF much of that work is deferred until you first call sess.run.
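The first point can be made concrete with a rough, library-agnostic cost model: every framework op call pays a fixed Python/dispatch overhead plus a per-element compute cost, so for scalar ops the overhead is essentially all you measure. The constants below are illustrative assumptions, not measured values for either library:

```python
# Hypothetical cost model for a single framework op call:
# a fixed dispatch overhead plus a per-element compute cost.
overhead = 100e-6     # 100 µs fixed overhead per call (assumed, illustrative)
per_element = 1e-9    # 1 ns of compute per tensor element (assumed, illustrative)

def call_cost(n_elements):
    """Estimated wall time for one op on a tensor of n_elements."""
    return overhead + n_elements * per_element

# For a scalar, the fixed overhead is essentially 100% of the cost;
# for a 1M-element tensor it is only a small fraction.
print("scalar:      {:.1%} overhead".format(overhead / call_cost(1)))
print("1M elements: {:.1%} overhead".format(overhead / call_cost(1_000_000)))
```

Under these (assumed) constants, a scalar benchmark measures almost nothing but per-call overhead, which is why it says little about throughput on realistically sized tensors.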

That said, there are also realistic scenarios when TF is slower than Theano.
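To exclude compilation from the TF measurement, you can call one_step() once before entering the Timer block, so the deferred graph setup happens outside the timed region. The warm-up pattern itself is library-agnostic; here is a minimal sketch using a stand-in function that, like a first sess.run, pays a one-time setup cost (the sleep simulates compilation and is not real framework behavior):

```python
import time

def make_step():
    """Stand-in for a framework op: expensive one-time setup on the
    first call (simulated compilation), cheap execution afterwards."""
    state = {"compiled": False}
    def step():
        if not state["compiled"]:
            time.sleep(0.05)      # simulated one-time graph compilation
            state["compiled"] = True
        return sum(i * i for i in range(100))  # cheap per-call work
    return step

step = make_step()

# Cold timing: the first call includes the one-time setup cost.
t0 = time.time()
step()
cold = time.time() - t0

# Warm-up done; now time only the steady-state run time.
t0 = time.time()
for _ in range(1000):
    step()
warm = time.time() - t0

print("first call: {:.4f} s, 1000 warm calls: {:.4f} s".format(cold, warm))
```

Applied to the benchmark above, this amounts to running one_step() once before `with Timer('tensorflow')`, which puts TF and Theano on the same footing (Theano's compilation already happens inside theano.function, outside its timed loop).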
