
Store TF model weights on the CPU?

I am trying to adapt this CNN built for MNIST to CIFAR10. I've modified the structure to accommodate the size of CIFAR images (32x32). The model trains fine on my GPU, but whenever I run the evaluation function on the test data (test_eval), I get an out-of-memory error at the first convolution (hconv1):

 ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10000,32,32,32]
     [[Node: Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Reshape, W_conv1/read/_79)]]

I then tried running the same code on my CPU (by using export CUDA_VISIBLE_DEVICES=""), and the test_eval function runs successfully. So I thought that if I explicitly stored only my weight variables on the CPU, it might work. I modified the weight_variable function from:

def weight_variable(self, shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

to

def weight_variable(self, shape):
    with tf.device('/cpu:0'):
        initial = tf.Variable(tf.truncated_normal(shape, stddev=0.1))            
    return initial

Yet I still get an OOM error when I run the test_eval function. Does the above code actually store the variable on the CPU? How do I solve this?

My GPU is a GeForce GTX 950M with 2GB of memory. Also, the MNIST CNN code that I posted above was adapted from this.
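One way to sidestep the OOM regardless of where the weights live is to run the evaluation over mini-batches instead of feeding all 10000 test images at once. A minimal sketch (eval_in_batches and run_accuracy are hypothetical names, not from the posted code; run_accuracy stands in for a call like sess.run(num_correct, feed_dict=...)):

```python
def eval_in_batches(run_accuracy, images, labels, batch_size=500):
    """Evaluate in mini-batches so only batch_size images' worth of
    activations live on the GPU at any one time.

    run_accuracy is a stand-in for something like
    sess.run(num_correct, feed_dict={x: batch_x, y_: batch_y});
    it should return the number of correct predictions for one batch.
    """
    correct = 0
    for start in range(0, len(images), batch_size):
        batch_images = images[start:start + batch_size]
        batch_labels = labels[start:start + batch_size]
        correct += run_accuracy(batch_images, batch_labels)
    return correct / len(images)
```

With batch_size=500, the first conv layer's activation tensor is 500x32x32x32 floats (about 65 MB) instead of 10000x32x32x32 (about 1.3 GB), which fits comfortably in 2 GB of GPU memory.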

Memory is usually taken up by activations, not weights. Your first conv layer has only a few kilobytes of weights, but for a 10000-image batch its output is a 10000x32x32x32 float32 tensor — exactly the shape in your OOM message — which is about 1.31 GB. This is followed by a broadcasting bias add that materializes another tensor of the same shape, so you need more than 2GB of GPU RAM just to get through the first layer. See this for an example of profiling memory.
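The arithmetic can be checked directly from the shape in the OOM message ([10000, 32, 32, 32] at 4 bytes per float32 element):

```python
# Memory held by the first conv layer's activation tensor when the whole
# 10000-image CIFAR test set is fed in one batch (shape from the OOM error).
batch, height, width, filters = 10000, 32, 32, 32
bytes_per_float32 = 4

activation_bytes = batch * height * width * filters * bytes_per_float32
print(activation_bytes / 1e9)  # about 1.31 GB for a single tensor

# The broadcasting bias add that follows produces another tensor of the
# same shape, so the first layer alone can exhaust a 2 GB GPU.
```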

