High memory usage in Python

Question

The following simple Python code:

class Node:
    NumberOfNodes = 0
    def __init__(self):
        Node.NumberOfNodes += 1
if __name__ == '__main__':
    nodes = []
    for i in xrange(1, 7 * 1000 * 1000):
        if i % 1000 == 0:
            print i
        nodes.append(Node())

takes gigabytes of memory, which I think is unreasonable. Is that normal in Python?

How could I fix that? (In my original code, I have about 7 million objects, each with 10 fields, and that takes 8 gigabytes of RAM.)

Answer 1

If you have a fixed number of fields, you can use __slots__ to save quite a lot of memory. Note that __slots__ does have some limitations, so make sure you read the Notes on using __slots__ carefully before choosing to use it in your application:

>>> import sys
>>> class Node(object):
...     NumberOfNodes = 0
...     def __init__(self):
...         Node.NumberOfNodes += 1
...
>>> n = Node()
>>> sys.getsizeof(n)
64
>>> class Node(object):
...     __slots__ = ()
...     NumberOfNodes = 0
...     def __init__(self):
...         Node.NumberOfNodes += 1
...
>>> n = Node()
>>> sys.getsizeof(n)
16
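
The session above uses an empty __slots__, whereas the question mentions objects with about 10 fields. As a rough sketch of that case, the snippet below compares a regular class with a slotted one. The field names (field0 to field9) and the class names PlainNode and SlottedNode are made up for illustration, and the exact byte counts depend on the Python version and platform, so none are shown here.

```python
import sys

# Hypothetical field names standing in for the ten fields mentioned in the question.
FIELDS = tuple("field%d" % i for i in range(10))


class PlainNode(object):
    """Regular class: each instance stores its attributes in a per-instance __dict__."""

    def __init__(self):
        for name in FIELDS:
            setattr(self, name, 0)


class SlottedNode(object):
    """Same fields, declared up front so no per-instance __dict__ is created."""

    __slots__ = FIELDS

    def __init__(self):
        for name in FIELDS:
            setattr(self, name, 0)


plain = PlainNode()
slotted = SlottedNode()

# sys.getsizeof does not follow references, so add the instance __dict__ by hand.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)

print("plain:   %d bytes (object + __dict__)" % plain_size)
print("slotted: %d bytes" % slotted_size)

# One of the limitations: attributes not listed in __slots__ cannot be added.
try:
    slotted.extra = 1
except AttributeError:
    print("SlottedNode rejects undeclared attributes")
```

Neither figure includes the values the fields point to, so treat the numbers as a lower bound on per-object cost rather than a full accounting.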

Answer 2

Python is an inherently memory-heavy programming language, but there are some ways around this. __slots__ is one. Another, more extreme approach is to store your data with numpy. You can use numpy to create a structured array or record array: a compound data type that uses minimal memory but loses a great deal of functionality compared to a normal Python class. That is, you are working with the numpy array class rather than your own class, so you cannot define your own methods on your array.

import numpy as np

# data type for a record with three 32-bit ints called x, y and z
dtype = [(name, np.int32) for name in 'xyz']
arr = np.zeros(1000, dtype=dtype)
# access member x of a record
arr[0]['x'] = 1 # name based access
# or
assert arr[0][0] == 1 # index based access
# accessing all x members of records in array
assert arr['x'].sum() == 1
# size of array used to store elements in memory
assert arr.nbytes == 12000 # 1000 elements * 3 members * 4 bytes per int

See more here.
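
To relate this to the numbers in the question, here is a rough sketch that estimates the storage for 7 million records of 10 fields each. It assumes every field fits in a 32-bit integer, which the question does not state; the field names f0 to f9 and the helper total_f0 are invented for the example. Since methods cannot be attached to the array, the per-record logic lives in an ordinary function instead.

```python
import numpy as np

# Hypothetical layout: ten 32-bit integer fields per record, named f0..f9.
record_dtype = np.dtype([("f%d" % i, np.int32) for i in range(10)])

# Each record occupies the sum of its field sizes: 10 fields * 4 bytes.
bytes_per_record = record_dtype.itemsize

# Estimated storage for 7 million records, computed without allocating them.
n_records = 7 * 1000 * 1000
estimated_bytes = n_records * bytes_per_record
print("estimated: %.0f MB" % (estimated_bytes / 1e6))

# A small array to demonstrate usage.
records = np.zeros(1000, dtype=record_dtype)
records["f0"] = np.arange(1000)


def total_f0(arr):
    """Plain function standing in for what would be a method on a class."""
    return int(arr["f0"].sum())


print(total_f0(records))
```

Under these assumptions the data comes to roughly 280 MB, compared with the 8 GB reported in the question. Fields that need wider types or strings would raise that figure.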

2 answers

Answer 1: 3, accepted, 2015-01-07 23:50:49

Answer 2: 1, 2015-01-08 00:42:11
