Why are explicit calls to magic methods slower than “sugared” syntax?

I was messing around with a small custom data object that needs to be hashable, comparable, and fast, when I ran into an odd-looking set of timing results. Some of the comparisons (and the hashing method) for this object simply delegate to an attribute, so I was using something like:

def __hash__(self):
    return self.foo.__hash__()

However, upon testing, I discovered that hash(self.foo) is noticeably faster. Curious, I tested __eq__, __ne__, and the other magic comparisons, only to discover that all of them ran faster if I used the sugary forms (==, !=, <, etc.). Why is this? I assumed the sugared form would have to make the same function call under the hood, but perhaps this isn't the case?

Timeit results

Setups: thin wrappers around an instance attribute that controls all the comparisons.

Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:13:51) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> 
>>> sugar_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...             self._foo = f
...     def __hash__(self):
...             return hash(self._foo)
...     def __eq__(self, other):
...             return self._foo == other._foo
...     def __ne__(self, other):
...             return self._foo != other._foo
...     def __lt__(self, other):
...             return self._foo < other._foo
...     def __gt__(self, other):
...             return self._foo > other._foo
... '''
>>> explicit_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...             self._foo = f
...     def __hash__(self):
...             return self._foo.__hash__()
...     def __eq__(self, other):
...             return self._foo.__eq__(other._foo)
...     def __ne__(self, other):
...             return self._foo.__ne__(other._foo)
...     def __lt__(self, other):
...             return self._foo.__lt__(other._foo)
...     def __gt__(self, other):
...             return self._foo.__gt__(other._foo)
... '''

Tests

My custom object is wrapping a datetime, so that's what I used, but it shouldn't make any difference. Yes, I'm creating the datetimes within the tests, so there's obviously some associated overhead there, but that overhead is constant from one test to another, so it shouldn't make a difference. I've omitted the __ne__ and __gt__ tests for brevity, but those results were essentially identical to the ones shown here.

>>> test_hash = '''\
... for i in range(1, 1000):
...     hash(Thin(datetime.datetime.fromordinal(i)))
... '''
>>> test_eq = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a == a # True
...     a == b # False
... '''
>>> test_lt = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a < b # True
...     b < a # False
... '''

Results

>>> min(timeit.repeat(test_hash, explicit_setup, number=1000, repeat=20))
1.0805227295846862
>>> min(timeit.repeat(test_hash, sugar_setup, number=1000, repeat=20))
1.0135617737162192
>>> min(timeit.repeat(test_eq, explicit_setup, number=1000, repeat=20))
2.349765956168767
>>> min(timeit.repeat(test_eq, sugar_setup, number=1000, repeat=20))
2.1486044757355103
>>> min(timeit.repeat(test_lt, explicit_setup, number=500, repeat=20))
1.156479287717275
>>> min(timeit.repeat(test_lt, sugar_setup, number=500, repeat=20))
1.0673696685109917
  • Hash:
    • Explicit: 1.0805227295846862
    • Sugared: 1.0135617737162192
  • Equal:
    • Explicit: 2.349765956168767
    • Sugared: 2.1486044757355103
  • Less Than:
    • Explicit: 1.156479287717275
    • Sugared: 1.0673696685109917

Two reasons:

  • The API lookups (what hash() and the operators perform internally) look at the type only. They don't look at self.foo.__hash__; they look for type(self.foo).__hash__. That's one less dictionary to look in.

  • The C slot lookup is faster than the pure-Python attribute lookup (which goes through __getattribute__). With the sugared form, looking up the method object, including the descriptor binding, is done entirely in C, bypassing __getattribute__.
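Both points can be observed from Python. The following is a minimal sketch, not taken from the original benchmark: the classes Foo and Spy are hypothetical names made up for illustration. The first half shadows __hash__ on an instance to show that hash() consults only the type; the second half records lookups in __getattribute__ to show that the sugared form never goes through it.

```python
class Foo:
    def __hash__(self):
        return 1


foo = Foo()
# Shadow __hash__ in the instance dictionary (illustration only).
foo.__hash__ = lambda: 2

explicit_result = foo.__hash__()  # ordinary attribute lookup: finds the instance attribute
implicit_result = hash(foo)       # type-only lookup: uses Foo.__hash__


class Spy:
    """Records every attribute lookup that goes through __getattribute__."""
    looked_up = []

    def __getattribute__(self, name):
        Spy.looked_up.append(name)
        return object.__getattribute__(self, name)

    def __hash__(self):
        return 1


spy = Spy()
hash(spy)
after_sugared = list(Spy.looked_up)   # hash() did not go through __getattribute__
spy.__hash__()
after_explicit = list(Spy.looked_up)  # the explicit call did

print(explicit_result, implicit_result)  # 2 1
print(after_sugared, after_explicit)     # [] ['__hash__']
```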

So you'd have to cache the type(self._foo).__hash__ lookup locally, and even then the call would not be as fast as from C code. Just stick to the built-in functions and operators if speed is at a premium.
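For illustration, caching the type-level lookup might look roughly like the sketch below. The attribute name _foo_hash is an assumption introduced here, and no timing is claimed for it; the point is only to show what "caching the lookup" means and that the result matches hash().

```python
import datetime


class Thin:
    def __init__(self, f):
        self._foo = f
        # Look up the type-level method once (hypothetical attribute name).
        self._foo_hash = type(f).__hash__

    def __hash__(self):
        # Call the cached type-level method, passing the instance explicitly.
        return self._foo_hash(self._foo)


d = datetime.datetime.fromordinal(1)
same_hash = hash(Thin(d)) == hash(d)
print(same_hash)  # True
```

Even in this form, each call still pays for a Python-level attribute lookup and function call, which is why plain hash(self._foo) remains the simpler choice.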

Another reason to avoid calling the magic methods directly is that the comparison operators do more than just call one magic method; the methods have reflected versions too. For x < y, if x.__lt__ isn't defined or x.__lt__(y) returns the NotImplemented singleton, y.__gt__(x) is consulted as well.
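A small sketch of that fallback, using two hypothetical classes (Left and Right are made-up names) plus a built-in case with int and float:

```python
class Left:
    def __lt__(self, other):
        # Signal that Left does not know how to compare itself with `other`.
        return NotImplemented


class Right:
    def __gt__(self, other):
        return True


x, y = Left(), Right()

explicit = x.__lt__(y)  # the raw return value, NotImplemented
sugared = x < y         # falls back to y.__gt__(x)

# The same thing happens with built-in types: int does not handle float here.
int_explicit = (1).__lt__(2.5)
int_sugared = 1 < 2.5

print(explicit, sugared)          # NotImplemented True
print(int_explicit, int_sugared)  # NotImplemented True
```

A wrapper that returns self._foo.__lt__(other._foo) directly can therefore leak NotImplemented to its caller, whereas self._foo < other._foo gets the full operator protocol.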
