
Mixing CPU and GPU usage in Keras

I am building a neural network in Keras consisting of multiple LSTM, Permute and Dense layers.

It seems the LSTM layer is GPU-unfriendly, so I did some research and used:

with tf.device('/cpu:0'):
    out = LSTM(cells)(inp)

But based on my understanding of with, a with statement is a try...finally block that ensures clean-up code is executed. I don't know whether the following mixed CPU/GPU code works. Will it accelerate training?

with tf.device('/cpu:0'):
    out = LSTM(cells)(inp)
with tf.device('/gpu:0'):
    out = Permute(some_shape)(out)
with tf.device('/cpu:0'):
    out = LSTM(cells)(out)
with tf.device('/gpu:0'):
    out = Dense(output_size)(out)

As you may read here, tf.device is a context manager that switches the default device to the one passed as its argument, within the block it creates. So this code should run everything placed under '/cpu:0' on the CPU and the rest on the GPU.
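If you want to verify where each op actually ends up, TensorFlow can log device placement at session creation. A minimal sketch, assuming a Keras 2.x / TensorFlow 1.x setup like the one in the question:

import tensorflow as tf
from keras import backend as K

# Log every op's assigned device (CPU or GPU) to stderr when the graph runs
config = tf.ConfigProto(log_device_placement=True)
K.set_session(tf.Session(config=config))
# ...then build and fit the model as usual; placements are printed on execution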

Whether it will speed up your training is really hard to answer, because it depends on the machine you use. But I would not expect the computations to be faster: each change of device forces data to be copied between GPU memory and host RAM. This could even slow down your computations.
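If you want hard numbers for your own hardware, you could time the same model with the LSTM pinned to each device. A rough sketch (the layer sizes, shapes and dummy data below are arbitrary assumptions, not from the question):

import time
import numpy as np
import tensorflow as tf
from keras.layers import Input, LSTM, Dense
from keras.models import Model

def build(lstm_device):
    # Identical architecture; only the LSTM's device placement changes
    inp = Input(shape=(50, 32))  # (timesteps, features), arbitrary
    with tf.device(lstm_device):
        x = LSTM(64)(inp)
    with tf.device('/gpu:0'):
        out = Dense(10, activation='softmax')(x)
    model = Model(inp, out)
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    return model

X = np.random.rand(1000, 50, 32)
y = np.random.rand(1000, 10)

for dev in ('/gpu:0', '/cpu:0'):
    model = build(dev)
    start = time.time()
    model.fit(X, y, batch_size=100, epochs=1, verbose=0)
    print('LSTM on %s: %.2fs per epoch' % (dev, time.time() - start))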

I have created a model using 2 LSTM and 1 Dense layers and trained it on my GPU (NVIDIA GTX 1050 Ti). Here are my observations:

  1. Use CuDNNLSTM instead of LSTM: https://keras.io/layers/recurrent/#cudnnlstm
  2. Use a batch size that gives the GPU more parallelism; with a very small batch size (2-10) the GPU's cores are not fully utilized, so I used 100 as the batch size.
  3. If I train my network on the GPU and then try to use it for predictions on the CPU, it compiles and runs, but the predictions are weird (see the sketch after the snippet below). In my case I have the luxury of using a GPU for prediction as well.
  4. For multi-layer LSTM, you need to use return_sequences=True on every LSTM layer except the last.

Here is a sample snippet:

import keras

# neurons, nbatch_size and reshapedX come from the author's own data/config
model = keras.Sequential()
model.add(keras.layers.CuDNNLSTM(
    neurons,
    batch_input_shape=(nbatch_size, reshapedX.shape[1], reshapedX.shape[2]),
    return_sequences=True,  # needed so a stacked recurrent layer gets sequences
    stateful=True))
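Regarding observation 3, a workaround worth testing (an assumption of mine, not part of the original answer) is to save the CuDNNLSTM weights and load them into an equivalent plain LSTM for CPU inference. Keras 2.x converts the CuDNN weight layout when loading from HDF5, but the LSTM must use recurrent_activation='sigmoid' rather than the old hard_sigmoid default, otherwise the predictions come out distorted:

# Hypothetical sketch, reusing model, neurons, nbatch_size and reshapedX
# from the snippet above
model.save_weights('lstm_weights.h5')  # GPU-trained CuDNNLSTM weights

cpu_model = keras.Sequential()
cpu_model.add(keras.layers.LSTM(
    neurons,
    batch_input_shape=(nbatch_size, reshapedX.shape[1], reshapedX.shape[2]),
    return_sequences=True,
    stateful=True,
    recurrent_activation='sigmoid'))  # matches CuDNNLSTM's fixed activation
cpu_model.load_weights('lstm_weights.h5')  # Keras converts the CuDNN layout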

TojoHere's answer is the one that needs to be upvoted! This trick made my LSTM training almost 10 times faster. Thanks a lot!
