Memory leak with large dataset when using mysql-python
I am experiencing what I believe is a memory leak when using the MySQLdb API.
Line # Mem usage Increment Line Contents
================================================
6 @profile
7 10.102 MB 0.000 MB def main():
8 10.105 MB 0.004 MB connection = MySQLdb.connect(host="localhost", db="mydb",
9 11.285 MB 1.180 MB user="notroot", passwd="Admin123", use_unicode=True)
10 11.285 MB 0.000 MB cursor = connection.cursor(cursorclass=MySQLdb.cursors.SSCursor)
11
12 11.289 MB 0.004 MB cursor.execute("select * from a big table;")
13
14 254.078 MB 242.789 MB results = [result for result in cursor]
15 251.672 MB -2.406 MB del results
16 251.672 MB 0.000 MB return
Also, when exploring the heap with guppy/hpy, it shows that most of my memory is occupied by unicode objects, ints and datetime objects (very likely the rows returned by the MySQLdb API).
I'm using Python 2.7.3 with mysql-python==1.2.4 on Ubuntu 12.04, and profiled with memory_profiler.
Could this be interning as described in http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm ?
Am I missing any dangling references?
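As a side note, interning in CPython mainly covers small integers and short, identifier-like strings, and can be observed directly with `sys.intern`. A minimal sketch:

```python
import sys

# sys.intern maps equal strings to a single shared object, so two
# strings built independently at runtime end up being the same object.
a = sys.intern("row-data-" + "x" * 10)
b = sys.intern("row-data-" + "x" * 10)
print(a is b)  # True: both names reference the one interned copy
```

Interning alone would not normally account for hundreds of megabytes of row data, though.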
EDIT: I also closed the cursor and connection but still got similar results.
SOLVED: Facepalm. I was doing a list comprehension, which naturally kept everything in memory. When consuming the iterator properly (streaming to a file or something) it has decent memory usage.
Line # Mem usage Increment Line Contents
================================================
16 @profile
17 10.055 MB 0.000 MB def main():
18 10.059 MB 0.004 MB connection = MySQLdb.connect(host="localhost", db="mydb",
19 11.242 MB 1.184 MB user="notroot", passwd="Admin123", use_unicode=True)
20 11.242 MB 0.000 MB cursor = connection.cursor(cursorclass=MySQLdb.cursors.SSCursor)
21
22 11.246 MB 0.004 MB cursor.execute("select * from big table")
23 11.246 MB 0.000 MB count = 0
24 30.887 MB 19.641 MB for result in cursor:
25 30.887 MB 0.000 MB count = count + 1
26 30.895 MB 0.008 MB cursor.close()
27 30.898 MB 0.004 MB connection.close()
28 30.898 MB 0.000 MB return
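The contrast between the two runs can be reproduced without MySQL at all: a list comprehension over any iterator materialises every item, while a plain loop holds only the current one. A minimal sketch with a stand-in generator (the `fake_cursor` name is made up for illustration):

```python
import sys

# Stand-in for a server-side cursor: yields rows lazily, one at a time.
def fake_cursor(n_rows):
    for i in range(n_rows):
        yield (i, u"some text", i * 2)

# The original approach: a list comprehension materialises every row.
rows = [row for row in fake_cursor(100000)]
list_size = sys.getsizeof(rows)  # size of the list object alone, excluding the rows

# The fixed approach: consume the iterator without keeping rows around.
count = 0
for row in fake_cursor(100000):
    count += 1

print(list_size)  # hundreds of KB just for the list's pointer array
print(count)      # all 100000 rows were seen, but only one held at a time
```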
Solved by the OP. His original code contained the line

results = [result for result in cursor]

This list comprehension stored the entire result set in memory, rather than streaming it from the server as needed. The OP replaced it with a simple

for result in cursor:
    ...

and saw his memory usage go back to normal.
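Consuming the cursor row by row and writing each row straight to disk keeps memory flat. A sketch with a generator standing in for the server-side cursor (the names here are illustrative, not from the OP's code):

```python
import csv
import os
import tempfile

# Stand-in for MySQLdb's SSCursor: yields rows one at a time.
def stream_rows(n):
    for i in range(n):
        yield (i, "name-%d" % i)

# Stream every row straight to a CSV file; only one row is in memory at a time.
path = os.path.join(tempfile.mkdtemp(), "dump.csv")
with open(path, "w", newline="") as fh:
    writer = csv.writer(fh)
    for row in stream_rows(1000):
        writer.writerow(row)
```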