Python Garbage Collection sometimes not working in Jupyter Notebook
My Jupyter notebooks keep running out of RAM in some cases, and I can't seem to free memory that is no longer needed. Here is an example:
import gc
thing = Thing()
result = thing.do_something(...)
thing = None
gc.collect()
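As you can probably guess, thing uses a lot of memory to do something, and after that I no longer need it. I should be able to free the memory it uses. Even though it is not written to any variable I can access from the notebook, the garbage collector does not free the space properly. The only workaround I have found is to write result to a pickle file, restart the kernel, load result from the pickle file, and continue. This is really inconvenient when running long notebooks. How can I free the memory properly?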
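For reference, the pickle workaround described above can be sketched roughly as follows. The file name result.pkl and the stand-in value of result are assumptions for illustration only; the real result has to be picklable for this to work.

```python
import os
import pickle
import tempfile

# Stand-in for the real result; any picklable object works.
result = {"scores": [0.1, 0.2, 0.3]}

# Hypothetical location for the saved file.
path = os.path.join(tempfile.gettempdir(), "result.pkl")

# Before restarting the kernel: write the result to disk.
with open(path, "wb") as f:
    pickle.dump(result, f)

# After restarting the kernel: load it back and carry on.
with open(path, "rb") as f:
    restored = pickle.load(f)
```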
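There are a number of issues at play here. The first is that IPython (what Jupyter uses behind the scenes) keeps additional references to objects when you see something like Out[67]. In fact, you can use that syntax to recall the object and do something with it, for example str(Out[67]). The second problem is that Jupyter seems to keep its own references to output variables, so only a full reset of IPython will work. But that is not much different from just restarting the notebook.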
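The effect of such a cached reference can be imitated in plain Python. In this sketch the dictionary fake_out and the class Thing are hypothetical stand-ins (this is not how IPython implements Out), and the immediate cleanup relies on CPython's reference counting.

```python
import weakref

class Thing:
    """Placeholder for an object that uses a lot of memory."""

fake_out = {}            # hypothetical stand-in for IPython's Out cache

thing = Thing()
fake_out[5] = thing      # roughly what happens when a cell displays `thing`
r = weakref.ref(thing)   # lets us check whether the object still exists

thing = None             # drop our own name for the object
alive_after_rebinding = r() is not None   # the cache still holds a reference

fake_out.clear()         # drop the cached reference as well
alive_after_clearing = r() is not None    # nothing refers to the object now

print(alive_after_rebinding, alive_after_clearing)
```

Setting the name to None is not enough while the cache still refers to the object; the object only goes away once the cached reference is dropped too.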
There is a solution, though! I wrote a function that you can run that will clear all variables except the ones you explicitly ask to keep.
def my_reset(*varnames):
    """
    varnames are what you want to keep
    """
    globals_ = globals()
    to_save = {v: globals_[v] for v in varnames}
    to_save['my_reset'] = my_reset  # let's keep this function by default
    del globals_
    # .magic() is deprecated in newer IPython releases;
    # get_ipython().run_line_magic("reset", "") is the current spelling.
    get_ipython().magic("reset")
    globals().update(to_save)
You would use it like this:
x = 1
y = 2
my_reset('x')
assert 'y' not in globals()
assert x == 1
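Below is a notebook I wrote that shows a bit of what is happening behind the scenes, and how you can use the weakref module to see when an object has truly been deleted. You can try running it to see if it helps you understand what is going on.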
In [1]: class MyObject:
    pass
In [2]: obj = MyObject()
In [3]: # now let's try deleting the object
# First, create a weak reference to obj, so we can know when it is truly deleted.
from weakref import ref
from sys import getrefcount
r = ref(obj)
print("the weak reference looks like", r)
print("it has a reference count of", getrefcount(r()))
# this prints a ref count of 2 (1 for obj and 1 because getrefcount
# had a reference to obj)
del obj
# since obj was the only strong reference to the object, it should have been
# garbage collected now.
print("the weak reference looks like", r)
the weak reference looks like <weakref at 0x7f29a809d638; to 'MyObject' at 0x7f29a810cf60>
it has a reference count of 2
the weak reference looks like <weakref at 0x7f29a809d638; dead>
In [4]: # let's try again, but this time we won't print obj, we will just do "obj"
obj = MyObject()
In [5]: print(getrefcount(obj))
obj
2
Out[5]: <__main__.MyObject at 0x7f29a80a0c18>
In [6]: # note the "Out[5]". This is a second reference to our object
# and will keep it alive if we delete obj
r = ref(obj)
del obj
print("the weak reference looks like", r)
print("with a reference count of:", getrefcount(r()))
the weak reference looks like <weakref at 0x7f29a809db88; to 'MyObject' at 0x7f29a80a0c18>
with a reference count of: 7
In [7]: # So what happened? It's that Out[5] that is keeping the object alive.
# if we clear our Out variables it should go away...
# As it turns out Jupyter keeps a number of its own variables lying around,
# so we have to reset pretty much everything.
In [8]: def my_reset(*varnames):
    """
    varnames are what you want to keep
    """
    globals_ = globals()
    to_save = {v: globals_[v] for v in varnames}
    to_save['my_reset'] = my_reset  # let's keep this function by default
    del globals_
    get_ipython().magic("reset")
    globals().update(to_save)
my_reset('r')  # clear everything except our weak reference to the object
# you would use this to keep "thing" around.
Once deleted, variables cannot be recovered. Proceed (y/[n])? y
In [9]: print("the weak reference looks like", r)
the weak reference looks like <weakref at 0x7f29a809db88; dead>
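I had the same problem, and after many hours of struggling, the solution that worked for me turned out to be very simple. You just need to put all of your code in a single cell. Within the same cell, garbage collection works normally; it is only after you leave the cell that the variables pick up all the extra references and can no longer be collected.
For long notebooks this can be a very inconvenient and unreadable way to work, but the idea is that within a cell you can garbage-collect the variables of that cell. So perhaps you can organize your code in such a way that gc.collect() can be called at the end of the cell, before you leave it.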
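One way to organize a cell along these lines is to do the heavy work inside a function, so that temporaries disappear when the function returns and only the result is kept. This is a sketch under assumptions, not something taken from the answers above: Thing, do_something, compute and probe are hypothetical stand-ins, and the weak reference is there only to check that the object was actually freed.

```python
import gc
import weakref

class Thing:
    """Hypothetical stand-in for the memory-hungry object in the question."""
    def __init__(self):
        self.buffer = bytearray(10_000_000)  # roughly 10 MB of scratch space

    def do_something(self):
        return len(self.buffer)              # a small result we want to keep

probe = None  # weak reference used only to check that the Thing was freed

def compute():
    global probe
    thing = Thing()                # local name: gone when the function returns
    probe = weakref.ref(thing)
    return thing.do_something()

result = compute()
gc.collect()                       # also clean up any reference cycles
print(result, probe() is None)
```

Because the cell ends with a print call rather than a bare expression, nothing from it is stored in the Out cache.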
Hope this helps :)