
Keras' clear_session() not working in Google Colab

I run a Keras model several times in Google Colab. Because of how TensorFlow works, a new model is created on each run, and memory is exhausted after a few runs. I read that Keras' clear_session() should help with this problem, but it does not seem to work. I created an MWE for Google Colab below.

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import backend as K

X = np.zeros([10, 10000])
y = np.zeros([10, 10000])

m = Sequential([Dense(10000, input_shape=(10000,)), Dense(10000), Dense(10000), Dense(10000)])


After running the part below ######## three times, I get the following error:


ResourceExhaustedError                    Traceback (most recent call last)

<ipython-input-3-7ae5ab890fc2> in <module>
      3 m.summary()
----> 5 m.fit(X,y)
      6 K.clear_session()

1 frames

/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     53     ctx.ensure_initialized()
     54     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 55                                         inputs, attrs, num_outputs)
     56   except core._NotOkStatusException as e:
     57     if name is not None:

ResourceExhaustedError: Graph execution error:

Detected at node 'RMSprop/RMSprop/update_2/mul_2' defined at (most recent call last):
    File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
      "__main__", mod_spec)
    File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
      exec(code, run_globals)
    File "/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py", line 16, in <module>
    File "/usr/local/lib/python3.7/dist-packages/traitlets/config/application.py", line 846, in launch_instance
    File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelapp.py", line 612, in start
    File "/usr/local/lib/python3.7/dist-packages/tornado/platform/asyncio.py", line 132, in start
    File "/usr/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
    File "/usr/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
    File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
      self._context.run(self._callback, *self._args)
    File "/usr/local/lib/python3.7/dist-packages/tornado/ioloop.py", line 758, in _run_callback
      ret = callback()
    File "/usr/local/lib/python3.7/dist-packages/tornado/stack_context.py", line 300, in null_wrapper
      return fn(*args, **kwargs)
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 1233, in inner
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 1147, in run
      yielded = self.gen.send(value)
    File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 381, in dispatch_queue
      yield self.process_one()
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 346, in wrapper
      runner = Runner(result, future, yielded)
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 1080, in __init__
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 1147, in run
      yielded = self.gen.send(value)
    File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 365, in process_one
      yield gen.maybe_future(dispatch(*args))
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 326, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 268, in dispatch_shell
      yield gen.maybe_future(handler(stream, idents, msg))
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 326, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 545, in execute_request
      user_expressions, allow_stdin,
    File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 326, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.7/dist-packages/ipykernel/ipkernel.py", line 306, in do_execute
      res = shell.run_cell(code, store_history=store_history, silent=silent)
    File "/usr/local/lib/python3.7/dist-packages/ipykernel/zmqshell.py", line 536, in run_cell
      return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
    File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2855, in run_cell
      raw_cell, store_history, silent, shell_futures)
    File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2881, in _run_cell
      return runner(coro)
    File "/usr/local/lib/python3.7/dist-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
    File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 3058, in run_cell_async
      interactivity=interactivity, compiler=compiler, result=result)
    File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 3249, in run_ast_nodes
      if (await self.run_code(code, result,  async_=asy)):
    File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 3326, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-3-7ae5ab890fc2>", line 5, in <module>
    File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1409, in fit
      tmp_logs = self.train_function(iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1051, in train_function
      return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1040, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1030, in run_step
      outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 893, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/usr/local/lib/python3.7/dist-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 539, in minimize
      return self.apply_gradients(grads_and_vars, name=name)
    File "/usr/local/lib/python3.7/dist-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 682, in apply_gradients
    File "/usr/local/lib/python3.7/dist-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 724, in _distributed_apply
      var, apply_grad_to_update_var, args=(grad,), group=False)
    File "/usr/local/lib/python3.7/dist-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 706, in apply_grad_to_update_var
      update_op = self._resource_apply_dense(grad, var, **apply_kwargs)
    File "/usr/local/lib/python3.7/dist-packages/keras/optimizers/optimizer_v2/rmsprop.py", line 216, in _resource_apply_dense
      var_t = var - coefficients["lr_t"] * grad / (
Node: 'RMSprop/RMSprop/update_2/mul_2'
failed to allocate memory
     [[{{node RMSprop/RMSprop/update_2/mul_2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

I want to experiment with slightly different data on the same model, so I run a similar part several times. I can simply restart the notebook after the error, but loading the data takes some time. Is there a way to really clear the old model? Thanks for your help.
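One pattern often suggested for this situation is to keep each model inside a function, return only plain Python values, and drop every reference before calling clear_session(). The sketch below is a minimal illustration, not a confirmed fix for the error above: the helper names build_model and train_once are hypothetical, the compile arguments are assumptions (the question does not show its compile call), and the width is kept small so the example runs quickly. Whether memory is actually returned depends on the TensorFlow version and on what else still references the model, for example notebook output caches.

```python
import gc

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense


def build_model(width):
    # Same layout as the MWE: four Dense layers of equal width.
    return Sequential([Dense(width, input_shape=(width,)),
                       Dense(width), Dense(width), Dense(width)])


def train_once(X, y, width):
    """Build, train, and release one model; return only plain Python data."""
    model = build_model(width)
    # Assumption: the compile call is not shown in the question. "rmsprop"
    # matches the optimizer in the traceback; "mse" is a placeholder loss.
    model.compile(optimizer="rmsprop", loss="mse")
    history = model.fit(X, y, verbose=0)
    losses = [float(v) for v in history.history["loss"]]
    # Drop the references that keep weights and optimizer state alive,
    # then reset Keras' global state and run the garbage collector.
    del model, history
    K.clear_session()
    gc.collect()
    return losses


width = 8  # use 10000 to mirror the MWE
X = np.zeros([10, width])
y = np.zeros([10, width])
for run in range(3):
    losses = train_once(X, y, width)
    print(run, losses)
```

Because the model never becomes a notebook-level variable, nothing outside train_once can keep it alive between runs, while X and y stay loaded.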

Please restart the runtime and try again. I tried to replicate the code above and it works fine.

The output below is from the same code:

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import backend as K

X = np.zeros([10, 10000])
y = np.zeros([10, 10000])

m = Sequential([Dense(10000, input_shape=(10000,)), Dense(10000), Dense(10000), Dense(10000)])

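# Assumption: the compile call is not shown in the post. RMSprop matches the
# optimizer named in the traceback; the "mse" loss is a placeholder.
m.compile(optimizer="rmsprop", loss="mse")
m.summary()  # prints the model summary shown below
m.fit(X, y, epochs=2)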


Model: "sequential"
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 10000)             100010000 
 dense_1 (Dense)             (None, 10000)             100010000 
 dense_2 (Dense)             (None, 10000)             100010000 
 dense_3 (Dense)             (None, 10000)             100010000 
Total params: 400,040,000
Trainable params: 400,040,000
Non-trainable params: 0
Epoch 1/2
1/1 [==============================] - 10s 10s/step - loss: 0.0000e+00
Epoch 2/2
1/1 [==============================] - 6s 6s/step - loss: 0.0000e+00
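The parameter count in the summary also gives a rough sense of why memory is tight. The arithmetic below is a back-of-the-envelope sketch under two stated assumptions: values are stored as float32 (the Keras default), and a training step holds about three same-sized copies (weights, gradients, and one RMSprop accumulator per weight). Activations and framework overhead are ignored, so real usage will differ.

```python
# Parameter count of one Dense(10000) layer fed 10000 inputs: weights + biases.
units = 10_000
params_per_layer = units * units + units
total_params = 4 * params_per_layer  # four identical layers, as in the summary

bytes_per_value = 4  # assumption: float32, the Keras default dtype
weights_gib = total_params * bytes_per_value / 2**30

# Assumption: roughly three same-sized copies are alive during a training
# step: the weights, their gradients, and one RMSprop accumulator each.
copies = 3
training_gib = copies * weights_gib

print(f"{total_params:,} parameters")
print(f"{weights_gib:.2f} GiB for the weights alone")
print(f"{training_gib:.2f} GiB for weights, gradients, and RMSprop state")
```

The weights alone come to about 1.5 GiB, so a single leftover copy of the model or of its optimizer state from an earlier run takes up a large share of a Colab accelerator's memory.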

