Does PEP 412 make __slots__ redundant?
PEP 412, implemented in Python 3.3, introduces improved handling of attribute dictionaries, effectively reducing the memory footprint of class instances. __slots__ was designed for the same purpose, so is there any point in using __slots__ any more?
In an attempt to find out the answer myself, I ran the following test, but the results don't make much sense:
    class Slots(object):
        __slots__ = ['a', 'b', 'c', 'd', 'e']
        def __init__(self):
            self.a = 1
            self.b = 1
            self.c = 1
            self.d = 1
            self.e = 1

    class NoSlots(object):
        def __init__(self):
            self.a = 1
            self.b = 1
            self.c = 1
            self.d = 1
            self.e = 1
Python 3.3 results:

    >>> sys.getsizeof([Slots() for i in range(1000)])
    Out[1]: 9024
    >>> sys.getsizeof([NoSlots() for i in range(1000)])
    Out[1]: 9024

Python 2.7 results:

    >>> sys.getsizeof([Slots() for i in range(1000)])
    Out[1]: 4516
    >>> sys.getsizeof([NoSlots() for i in range(1000)])
    Out[1]: 4516
I would have expected the size to differ at least for Python 2.7, so I assume there is something wrong with the test.
No, PEP 412 does not make __slots__ redundant.
First, Armin Rigo is right that you're not measuring it properly. What you need to measure is the size of the object, plus the values, plus the __dict__ itself (for NoSlots only) and the keys (for NoSlots only).
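As a rough illustration of that accounting, here is a minimal sketch that adds the size of the instance `__dict__` to the size of the instance. The helper name `shallow_plus_dict` is made up for this example. How `sys.getsizeof` accounts for keys that are shared between instances varies between CPython versions, so treat the numbers as approximate.

```python
import sys

class Slots:
    __slots__ = ("a", "b", "c", "d", "e")
    def __init__(self):
        self.a = self.b = self.c = self.d = self.e = 1

class NoSlots:
    def __init__(self):
        self.a = self.b = self.c = self.d = self.e = 1

def shallow_plus_dict(obj):
    """Size of the instance plus its __dict__, if it has one.

    Hypothetical helper. The attribute values are left out because
    every instance here refers to the same integer object.
    """
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size

print("Slots:  ", shallow_plus_dict(Slots()))
print("NoSlots:", shallow_plus_dict(NoSlots()))
```

For the Slots instance the helper returns exactly what `sys.getsizeof` reports, since there is no `__dict__` to add.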
Or you could do what he suggests:
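    import sys
    import tracemalloc

    # Pass any command-line argument to measure Slots; otherwise NoSlots.
    cls = Slots if len(sys.argv) > 1 else NoSlots

    def f():
        tracemalloc.start()
        objs = [cls() for _ in range(100000)]
        print(tracemalloc.get_traced_memory())

    f()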
When I run this on 64-bit CPython 3.4 on OS X, I get 8824968 for Slots and 25624872 for NoSlots. So, it looks like a Slots instance takes 88 bytes, while a NoSlots instance takes 256 bytes.
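To compare both classes in one run, the measurement can be wrapped in a function. This is a sketch under assumptions: `bytes_per_instance` is a hypothetical helper, and the figure it returns includes the list's own pointer per element, just as the totals above do. Exact numbers depend on the CPython version and platform.

```python
import tracemalloc

class Slots:
    __slots__ = ("a", "b", "c", "d", "e")
    def __init__(self):
        self.a = self.b = self.c = self.d = self.e = 1

class NoSlots:
    def __init__(self):
        self.a = self.b = self.c = self.d = self.e = 1

def bytes_per_instance(cls, n=100_000):
    """Average traced bytes per instance, including the list slot."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    objs = [cls() for _ in range(n)]  # kept alive until after the reading
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / n

print("Slots:  ", bytes_per_instance(Slots))
print("NoSlots:", bytes_per_instance(NoSlots))
```

Dividing the totals quoted above by 100000 gives the same kind of per-instance figure: 8824968 / 100000 is about 88, and 25624872 / 100000 is about 256.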
How is this possible?
Because there are still two differences between __slots__ and a key-split __dict__.
First, the hash tables used by dictionaries are kept below 2/3rds full, and they grow exponentially and have a minimum size, so you're going to have some extra space. And it's not hard to work out how much space by looking at the nicely-commented source: you're going to have 8 hash buckets instead of 5 slots pointers.
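The "8 buckets instead of 5" figure follows from those three rules. The sketch below is a simplified model of them, not CPython's actual resizing code: the function name `buckets_needed` is made up, and the minimum size of 8 and the doubling growth are assumptions taken from the description above.

```python
def buckets_needed(n_keys, min_size=8):
    """Smallest table size, starting at min_size and doubling,
    that holds n_keys while staying at most 2/3 full.

    Simplified model for illustration only.
    """
    size = min_size
    while n_keys > (2 * size) // 3:
        size *= 2
    return size

print(buckets_needed(5))  # 8: five keys fit in the minimum-size table
print(buckets_needed(6))  # 16: a sixth key forces the table to double
```

So a class with 5 attributes pays for 8 value pointers per instance, and a class with 6 attributes would pay for 16 under this model.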
Second, the dictionary itself isn't free; it has a standard object header, a count, and two pointers. That might not sound like a lot, but when you're talking about an object that's only got a few attributes (note that most objects only have a few attributes…), the dict header can make as much difference as the hash table.
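A back-of-the-envelope tally of that header, assuming every field is one machine word and ignoring any garbage-collector bookkeeping. The function name is hypothetical and this is arithmetic on the description above, not a measurement:

```python
import struct

WORD = struct.calcsize("P")  # pointer size: 8 on a 64-bit build

def dict_header_bytes():
    """Estimated fixed cost of a dict object, per the description above."""
    object_header = 2 * WORD  # reference count + type pointer
    count = WORD              # number of items in use
    pointers = 2 * WORD       # pointer to the keys, pointer to the values
    return object_header + count + pointers

print(dict_header_bytes())      # 5 words
print(dict_header_bytes() + 3 * WORD)  # header plus the 3 unused buckets
```

On a 64-bit build the header alone is as large as the five slot pointers it is being compared against, which is why it matters for small objects.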
And of course in your example, the values are all the same small integer, which CPython shares between instances rather than copying. The only cost involved here is therefore the object itself, plus the 5 slots or the 8 hash buckets and dict header, so the difference is pretty dramatic. In real life, __slots__ will rarely be that much of a benefit.
Finally, notice that PEP 412 only claims:
Benchmarking shows that memory use is reduced by 10% to 20% for object-oriented programs
Think about where you use __slots__. Either the savings are so huge that not using __slots__ would be ridiculous, or you really need to squeeze out that last 15%. Or you're building an ABC or other class that you expect to be subclassed by who-knows-what and the subclasses might need the savings. At any rate, in those cases, the fact that you get half the benefit without __slots__, or even two thirds the benefit, is still rarely going to be enough; you'll still need to use __slots__.
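The base-class case deserves a concrete illustration: a subclass only gets the `__slots__` savings if every class above it also declares `__slots__`. A minimal sketch, with class names invented for the example:

```python
class SlottedBase:
    __slots__ = ()  # adds no attributes and no per-instance __dict__

class Point(SlottedBase):
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PlainBase:
    pass  # no __slots__, so instances get a __dict__

class LoosePoint(PlainBase):
    __slots__ = ("x", "y")  # slots exist, but the __dict__ is still there
    def __init__(self, x, y):
        self.x = x
        self.y = y

print(hasattr(Point(1, 2), "__dict__"))       # False
print(hasattr(LoosePoint(1, 2), "__dict__"))  # True
```

Declaring an empty `__slots__` in a base class costs nothing and leaves the choice to whoever subclasses it.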
The real win is in the cases where it isn't worth using __slots__; you'll get a small benefit for free.
(Also, there are definitely some programmers who overuse the hell out of __slots__, and maybe this change can convince some of them to put their energy into micro-optimizing something else not quite as irrelevant, if you're lucky.)
The problem is sys.getsizeof(), which rarely returns what you expect. For example, in this case it counts the "size" of an object without accounting for the size of its __dict__. I suggest you retry by measuring the real memory usage of creating 100,000 instances.
Note also that the Python 3.3 behavior was inspired by PyPy, in which __slots__ makes no difference, so I would expect it to make no difference in Python 3.3 too. As far as I can tell, __slots__ is almost never of any use now.