How does __slots__ avoid a dictionary lookup?

I've heard that __slots__ makes objects faster by avoiding a dictionary lookup. My confusion comes from Python being a dynamic language. In a static language, we avoid a dictionary lookup for a.test by doing a compile-time optimisation that saves the attribute's index in the instruction we run.

Now, in Python, a could just as easily be another object that has a dictionary or a different set of attributes. It seems like we'll still have to do a dictionary lookup; the only difference seems to be that we only need one dictionary for the class, rather than a dictionary for each object.

With this rationale,

  1. How does __slots__ avoid a dictionary lookup?
  2. Does __slots__ make accessing objects faster?

__slots__ does not (significantly) speed up attribute access:

>>> class Foo(object):
...     __slots__ = ('spam',)
...     def __init__(self):
...         self.spam = 'eggs'
... 
>>> class Bar(object):
...     def __init__(self):
...         self.spam = 'eggs'
... 
>>> import timeit
>>> timeit.timeit('t.spam', 'from __main__ import Foo; t=Foo()')
0.07030296325683594
>>> timeit.timeit('t.spam', 'from __main__ import Bar; t=Bar()')
0.07646608352661133

The goal of using __slots__ is to save memory; instead of using a .__dict__ mapping on the instance, the class has a descriptor object for each and every attribute named in __slots__, and instances have the attribute reserved whether or not it holds an actual value:

>>> class Foo(object):
...     __slots__ = ('spam',)
... 
>>> dir(Foo())
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', 'spam']
>>> Foo().spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: spam
>>> Foo.spam
<member 'spam' of 'Foo' objects>
>>> type(Foo.spam)
<type 'member_descriptor'>
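The memory saving can be checked directly. The following is a minimal sketch in Python 3, assuming CPython; the class names Slotted and Plain are made up for illustration, and the exact byte counts vary between interpreter versions, so none are quoted here.

```python
import sys


class Slotted:
    __slots__ = ('spam',)

    def __init__(self):
        self.spam = 'eggs'


class Plain:
    def __init__(self):
        self.spam = 'eggs'


slotted, plain = Slotted(), Plain()

# A slotted instance carries no per-instance dictionary at all.
print(hasattr(slotted, '__dict__'))
print(hasattr(plain, '__dict__'))

# For the plain instance, the dictionary is a separate object that
# has to be counted on top of the instance itself.
slotted_size = sys.getsizeof(slotted)
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
print(slotted_size, plain_size)
```

Note that sys.getsizeof only reports the shallow size of each object, which is why the dictionary is added by hand for the plain instance.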

Because each slot is exposed through a descriptor on the class, Python still has to look at the class for each attribute access on an instance of Foo (to find the descriptor). Any unknown attribute (say, Foo.ham) will still result in Python looking through the class MRO to search for that attribute, and that includes dictionary searches. And you can still assign additional attributes to the class:

>>> Foo.ham = 'eggs'
>>> dir(Foo)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', 'ham', 'spam']
>>> Foo().ham
'eggs'

The slot descriptors are created when the class is created, and they access memory assigned to each instance to store and retrieve a reference to the associated value (the same chunk of memory that tracks the instance's reference count and holds a reference back to the class object). Without slots, a descriptor for __dict__ is used to access a reference to a dict object in the same manner.
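To make the descriptor mechanism concrete, here is a small Python 3 sketch (assuming CPython) that calls the slot descriptor by hand. It roughly mirrors what ordinary attribute access does once the descriptor has been found on the class; it is an illustration, not a description of the interpreter's exact code path.

```python
class Foo:
    __slots__ = ('spam',)


foo = Foo()

# The descriptor lives in the class dictionary, not on the instance.
descriptor = Foo.__dict__['spam']
print(type(descriptor).__name__)

# Roughly equivalent to foo.spam = 'eggs' and foo.spam respectively.
descriptor.__set__(foo, 'eggs')
value = descriptor.__get__(foo, Foo)
print(value)

# With no per-instance __dict__, an undeclared attribute has nowhere to go.
try:
    foo.ham = 'eggs'
    assignment_failed = False
except AttributeError:
    assignment_failed = True
print(assignment_failed)
```

Finding the descriptor still means searching the class dictionary for 'spam', which is why slots do not remove the dictionary lookup itself.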

It might speed up a program where you instantiate lots of objects of the same class, genuinely never change what attributes they have, and where cache misses on all those duplicate dictionaries present a real performance problem.

This is really just a special case of the general situation where saving space sometimes saves time as well, when cache is the limiting factor.

So, it probably won't make accessing one object faster, but it may speed up accessing many objects of the same type.
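The space side of that argument can be measured across many instances. This is a rough sketch using the standard tracemalloc module, assuming CPython 3; the helper bytes_for and the instance count are made up for illustration. It measures allocated memory only and says nothing directly about cache behaviour or speed.

```python
import tracemalloc


class Slotted:
    __slots__ = ('spam',)

    def __init__(self):
        self.spam = 'eggs'


class Plain:
    def __init__(self):
        self.spam = 'eggs'


def bytes_for(cls, count=50_000):
    """Return the bytes tracemalloc attributes to `count` live instances."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    instances = [cls() for _ in range(count)]  # kept alive while measuring
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


slotted_bytes = bytes_for(Slotted)
plain_bytes = bytes_for(Plain)
print(slotted_bytes, plain_bytes)
```

Whether the smaller footprint turns into a measurable speed-up depends on the workload, so it is worth profiling the real program before relying on it.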

See also this question.
