Python memory consumption of objects and process
I wrote the following code:
from hurry.filesize import size
from pysize import get_size
import os
import psutil

# make_a_call() is the asker's own function; its definition is not shown.

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes " + size(process.memory_info().rss)
    objects = make_a_call()
    print "total size of objects is " + size(get_size(objects))
    print "process consumes " + size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)
get_size() returns the memory consumption of the objects, using this code.
I get the following output:
process consumes 21M
start method
total size of objects is 20M
process consumes 29M
exit method
process consumes 29M
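Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether; it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.
CPython implementation detail: CPython currently uses a reference-counting scheme with (optional) delayed detection of cyclically linked garbage, which collects most objects as soon as they become unreachable, but is not guaranteed to collect garbage containing circular references. See the documentation of the gc module for information on controlling the collection of cyclic garbage. Other implementations act differently and CPython may change. Do not depend on immediate finalization of objects when they become unreachable (so you should always close files explicitly).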
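To make the quoted behaviour concrete, here is a minimal sketch (Python 3, CPython assumed) that contrasts an object freed by reference counting alone with a reference cycle that only the cycle collector reclaims. The Node class and the demo function are hypothetical names introduced for this illustration; other implementations may behave differently, as the quote warns.

```python
import gc
import weakref


class Node(object):
    """Hypothetical helper class, used only for this illustration."""

    def __init__(self):
        self.other = None


def demo():
    # Disable the automatic cycle collector so it cannot run mid-demo.
    gc.disable()
    try:
        # Case 1: no cycle. Dropping the last reference frees the object
        # immediately under CPython's reference counting.
        plain = Node()
        plain_ref = weakref.ref(plain)
        del plain
        freed_without_gc = plain_ref() is None

        # Case 2: two objects that reference each other. Their reference
        # counts never reach zero, so they survive until a collection runs.
        a, b = Node(), Node()
        a.other, b.other = b, a
        cycle_ref = weakref.ref(a)
        del a, b
        alive_before_collect = cycle_ref() is not None
        gc.collect()
        freed_after_collect = cycle_ref() is None
    finally:
        gc.enable()
    return freed_without_gc, alive_before_collect, freed_after_collect


print(demo())
```

On CPython this should print (True, True, True). Note that it only shows when the objects are destroyed; it says nothing about whether the process hands the freed memory back to the operating system.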
Here is a fully working (Python 2.7) example that has the same problem (I have slightly updated the original code for simplicity's sake):
from hurry.filesize import size
from pysize import get_size
import os
import psutil

def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    # FIXME
    print "total size of objects is ", size(get_size(objects))
    print "process consumes ", size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)

main()
Here is the output:
process consumes 7M
start method
process consumes 7M
total size of objects is 30M
process consumes 124M
exit method
process consumes 124M
The difference is ~100 MB.
Here is the fixed version of the code:
from hurry.filesize import size
from pysize import get_size
import os
import psutil

def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    print "process consumes ", size(process.memory_info().rss)
    print "total size of objects is ", size(get_size(objects))
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)

main()
Here is the updated output:
process consumes 7M
start method
process consumes 7M
process consumes 38M
total size of objects is 30M
exit method
process consumes 124M
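Did you spot the difference? You were calculating the object sizes before measuring the final process size, and that calculation itself consumes additional memory. Let's check why that happens. Here is the source of https://github.com/bosswissam/pysize/blob/master/pysize.py: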
import sys
import inspect

def get_size(obj, seen=None):
    """Recursively finds size of objects in bytes"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if hasattr(obj, '__dict__'):
        for cls in obj.__class__.__mro__:
            if '__dict__' in cls.__dict__:
                d = cls.__dict__['__dict__']
                if inspect.isgetsetdescriptor(d) or inspect.ismemberdescriptor(d):
                    size += get_size(obj.__dict__, seen)
                break
    if isinstance(obj, dict):
        size += sum((get_size(v, seen) for v in obj.values()))
        size += sum((get_size(k, seen) for k in obj.keys()))
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum((get_size(i, seen) for i in obj))
    return size
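A lot is going on here! The most notable thing is that it keeps the id of every object it has seen in a set in order to resolve circular references. If you remove that line, it won't consume that much memory in either case.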
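To get a feel for how much the seen set costs by itself, here is a rough sketch (Python 3, CPython assumed). get_size_tracking is a hypothetical, simplified stand-in for pysize's get_size that handles only flat containers and takes the set from the caller, so the set can be inspected afterwards; seen_overhead is likewise a name made up for this illustration.

```python
import sys


def get_size_tracking(obj, seen):
    """Simplified stand-in for get_size; the caller supplies the `seen` set."""
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Record the id before recursing, exactly as the original does.
    seen.add(obj_id)
    size = sys.getsizeof(obj)
    if isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(get_size_tracking(item, seen) for item in obj)
    return size


def seen_overhead(n=1000000):
    """Measure n distinct objects and report what the bookkeeping set costs."""
    objects = [float(i) for i in range(n)]  # n distinct float objects
    seen = set()
    total = get_size_tracking(objects, seen)
    # sys.getsizeof(seen) counts only the set's hash table, not the int
    # objects that hold the ids, so the real overhead is higher still.
    return total, len(seen), sys.getsizeof(seen)


total, tracked, seen_bytes = seen_overhead()
print("objects: %d bytes, ids tracked: %d, seen set alone: %d bytes"
      % (total, tracked, seen_bytes))
```

The set holds one entry per visited object (the list plus each element), so its size grows with the number of objects being measured, which is why measuring a million-element list is not free.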
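If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don't necessarily return it to the operating system, so it may look as if the Python process uses a lot more virtual memory than it actually does.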
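One way to observe this gap is to compare what Python itself believes is allocated with what the operating system reports for the process. The sketch below (Python 3) uses the standard tracemalloc module for the Python-level view and, if it happens to be installed, the third-party psutil package used earlier for the RSS view. The measure and rss_bytes helpers are hypothetical names for this illustration, and the RSS figures depend on the platform and allocator, so no particular result is promised.

```python
import gc
import tracemalloc

try:
    import psutil  # third-party; optional in this sketch
except ImportError:
    psutil = None


def rss_bytes():
    """Resident set size of this process, or None when psutil is unavailable."""
    if psutil is None:
        return None
    return psutil.Process().memory_info().rss


def measure(n=1000000):
    """Allocate n strings, delete them, and record both views of memory."""
    tracemalloc.start()
    rss_before = rss_bytes()

    objects = [str(i) for i in range(n)]
    python_allocated, _ = tracemalloc.get_traced_memory()
    rss_allocated = rss_bytes()

    del objects
    gc.collect()
    python_after_del, _ = tracemalloc.get_traced_memory()
    rss_after_del = rss_bytes()
    tracemalloc.stop()

    return {
        "python_allocated": python_allocated,
        "python_after_del": python_after_del,
        "rss_before": rss_before,
        "rss_allocated": rss_allocated,
        "rss_after_del": rss_after_del,
    }


for name, value in sorted(measure().items()):
    print(name, value)
```

The Python-level figure should fall sharply after the del, because the objects really are gone. Whether rss_after_del returns to rss_before is up to the allocator and the operating system, which is the effect described above.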