Force garbage collection in Python to free memory

I have a Python 2.7 app which uses lots of dict objects, mostly containing strings for keys and values.

Sometimes those dicts and strings are not needed anymore and I would like to remove them from memory.

I tried different things, del dict[key], del dict, etc., but the app still uses the same amount of memory.

Below is an example which I would expect to free the memory. But it doesn't :(

import gc
import resource

def mem():
    # ru_maxrss is reported in bytes on macOS (kilobytes on Linux), hence dividing twice for MB
    print('Memory usage         : % 2.2f MB' % round(
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0/1024.0,1)
    )

mem()

print('...creating list of dicts...')
n = 10000
l = []
for i in xrange(n):
    a = 1000*'a'
    b = 1000*'b'
    l.append({ 'a' : a, 'b' : b })

mem()

print('...deleting list items...')

for i in xrange(n):
    l.pop(0)

mem()

print('GC collected objects : %d' % gc.collect())

mem()

Output:

Memory usage         :  4.30 MB
...creating list of dicts...
Memory usage         :  36.70 MB
...deleting list items...
Memory usage         :  36.70 MB
GC collected objects : 0
Memory usage         :  36.70 MB

I would expect some objects to be 'collected' here and some memory to be freed.

Am I doing something wrong? Are there any other ways to delete unused objects, or at least to find where the objects are unexpectedly still referenced?
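
For the "find where the objects are unexpectedly used" part, gc.get_referrers at least shows what still points at an object. A minimal sketch (the names big and holder are just for illustration, not taken from the app above):

import gc

big = {'a': 1000 * 'a', 'b': 1000 * 'b'}
holder = [big]                      # an extra reference that keeps the dict alive

# gc.get_referrers lists every tracked object that still refers to `big`;
# here it will include `holder` and the module namespace dict holding `big`
for ref in gc.get_referrers(big):
    print(type(ref))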

Fredrik Lundh explains:

If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don't necessarily return the memory to the operating system, so it may look as if the Python process uses a lot more virtual memory than it actually uses.

and Alex Martelli writes:

The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it's done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates.

So, you could use multiprocessing to spawn a subprocess, perform the memory-hogging calculation, and then ensure the memory is released when the subprocess terminates:

import multiprocessing as mp
import resource

def mem():
    # ru_maxrss is reported in kilobytes on Linux, hence dividing once for MB
    print('Memory usage         : % 2.2f MB' % round(
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0,1)
    )

mem()

def memoryhog():
    print('...creating list of dicts...')
    n = 10**5
    l = []
    for i in xrange(n):
        a = 1000*'a'
        b = 1000*'b'
        l.append({ 'a' : a, 'b' : b })
    mem()

proc = mp.Process(target=memoryhog)
proc.start()
proc.join()

mem()

yields

Memory usage         :  5.80 MB
...creating list of dicts...
Memory usage         :  234.20 MB
Memory usage         :  5.90 MB
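
If spawning a worker process is not practical, on Linux with glibc you can sometimes ask the allocator itself to hand freed heap memory back to the OS with malloc_trim. A rough sketch using ctypes (glibc-specific, and whether it actually lowers RSS depends on how fragmented the heap is):

import ctypes
import gc

gc.collect()                          # drop any unreachable cycles first

libc = ctypes.CDLL('libc.so.6')       # glibc only; not available on macOS/Windows
libc.malloc_trim.argtypes = [ctypes.c_size_t]

# malloc_trim(0) releases free memory at the top of the heap (and unused
# arena pages) back to the OS; it returns 1 if any memory was released.
released = libc.malloc_trim(0)
print('malloc_trim released memory: %s' % bool(released))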

Multiprocessing together with a library called Ray, which uses shared memory to share multi-GB data between processes, might also be somewhat useful. This makes it easy to spawn a secondary process and still access the same objects quickly and easily from the parent process.
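
A minimal sketch of that idea, assuming the ray package is installed (Ray requires Python 3, unlike the Python 2.7 code above):

import ray

ray.init()  # starts a local Ray runtime with a shared-memory object store

@ray.remote
def memoryhog():
    # the memory-hungry work happens inside a Ray worker process
    l = [{'a': 1000 * 'a', 'b': 1000 * 'b'} for _ in range(10 ** 5)]
    return len(l)  # only this small result is copied back to the parent

print(ray.get(memoryhog.remote()))  # the parent process stays small

Large results can also be left in Ray's object store and passed around as object references instead of being copied back into the parent process.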
