Python: Why are explicit calls to magic methods slower than “sugared” syntax?

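I was messing around with a small custom data object that needs to be hashable, comparable, and fast when I ran into an odd-looking set of timing results. Some of the comparisons (and the hashing method) for this object simply delegate to an attribute, so I was using something like: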

def __hash__(self):
    return self.foo.__hash__()

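However, upon testing, I discovered that hash(self.foo) is noticeably faster. Curious, I tested __eq__, __ne__, and the other magic comparisons, only to discover that all of them ran faster if I used the sugared forms (==, !=, <, etc.). Why is this? I assumed the sugared form would have to make the same function call under the hood, but perhaps that isn't the case?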

Timeit results

Setup: a thin wrapper around an instance attribute that controls all the comparisons.

Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:13:51) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>>
>>> sugar_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...         self._foo = f
...     def __hash__(self):
...         return hash(self._foo)
...     def __eq__(self, other):
...         return self._foo == other._foo
...     def __ne__(self, other):
...         return self._foo != other._foo
...     def __lt__(self, other):
...         return self._foo < other._foo
...     def __gt__(self, other):
...         return self._foo > other._foo
... '''
>>> explicit_setup = '''\
... import datetime
... class Thin(object):
...     def __init__(self, f):
...         self._foo = f
...     def __hash__(self):
...         return self._foo.__hash__()
...     def __eq__(self, other):
...         return self._foo.__eq__(other._foo)
...     def __ne__(self, other):
...         return self._foo.__ne__(other._foo)
...     def __lt__(self, other):
...         return self._foo.__lt__(other._foo)
...     def __gt__(self, other):
...         return self._foo.__gt__(other._foo)
... '''

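Tests

My custom object wraps a datetime, so that's what I used, but it shouldn't make any difference. Yes, I'm creating the datetimes within the tests, so there is obviously some overhead there, but that overhead is constant from one test to the next, so it shouldn't affect the comparison. I've omitted the __ne__ and __gt__ tests for brevity, but those results were basically identical to the ones shown here.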

>>> test_hash = '''\
... for i in range(1, 1000):
...     hash(Thin(datetime.datetime.fromordinal(i)))
... '''
>>> test_eq = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a == a  # True
...     a == b  # False
... '''
>>> test_lt = '''\
... for i in range(1, 1000):
...     a = Thin(datetime.datetime.fromordinal(i))
...     b = Thin(datetime.datetime.fromordinal(i+1))
...     a < b  # True
...     b < a  # False
... '''

Results

>>> min(timeit.repeat(test_hash, explicit_setup, number=1000, repeat=20))
1.0805227295846862
>>> min(timeit.repeat(test_hash, sugar_setup, number=1000, repeat=20))
1.0135617737162192
>>> min(timeit.repeat(test_eq, explicit_setup, number=1000, repeat=20))
2.349765956168767
>>> min(timeit.repeat(test_eq, sugar_setup, number=1000, repeat=20))
2.1486044757355103
>>> min(timeit.repeat(test_lt, explicit_setup, number=500, repeat=20))
1.156479287717275
>>> min(timeit.repeat(test_lt, sugar_setup, number=500, repeat=20))
1.0673696685109917

Hash:
- Explicit: 1.0805227295846862
- Sugared: 1.0135617737162192
Equality:
- Explicit: 2.349765956168767
- Sugared: 2.1486044757355103
Less than:
- Explicit: 1.156479287717275
- Sugared: 1.0673696685109917

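Two reasons:

- The API lookups look at the type only. They don't look at self.foo.__hash__; they look up type(self.foo).__hash__. That's one less dictionary to search.
- The C slot lookup is faster than the pure-Python attribute lookup (which goes through __getattribute__). Looking up the method object, including the descriptor binding, is done entirely in C, bypassing __getattribute__.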
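The type-only lookup can be observed directly. The following is a minimal sketch; the class name `Loud` and the instance-level override are hypothetical and exist only for illustration. An attribute named `__hash__` stored on the instance is found by an explicit `obj.__hash__()` call, but `hash(obj)` ignores it and uses the method defined on the type.

```python
class Loud:
    """Hypothetical class used only to illustrate type-level lookup."""

    def __hash__(self):
        return 1


obj = Loud()

# Store a different callable in the instance dictionary under the same name.
obj.__hash__ = lambda: 2

# Explicit call: ordinary attribute lookup finds the instance attribute first.
explicit_result = obj.__hash__()

# Sugared call: hash() consults type(obj).__hash__ and never looks at the
# instance dictionary.
implicit_result = hash(obj)

print(explicit_result, implicit_result)
```

The two calls return different values here, which shows that the sugared form is not simply a wrapper around the explicit attribute access.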

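So to match that, you'd have to cache the type(self._foo).__hash__ lookup locally, and even then the call would not be as fast as one made from C code. If speed is at a premium, just stick to the standard library functions.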
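As a rough illustration of what caching that lookup could look like, here is one possible sketch. The class `CachedThin` and the attribute `_foo_hash` are hypothetical names, not part of the original code, and no speed-up is claimed: as noted above, the plain `hash(self._foo)` form is still expected to be the better choice.

```python
import datetime


class CachedThin:
    """Hypothetical variant of Thin that caches the type-level __hash__."""

    def __init__(self, f):
        self._foo = f
        # Look up __hash__ on the type once. This is an unbound slot
        # wrapper, so the wrapped object must be passed in explicitly.
        self._foo_hash = type(f).__hash__

    def __hash__(self):
        # Equivalent in result to hash(self._foo), but still a Python-level
        # call rather than a direct C slot call.
        return self._foo_hash(self._foo)


value = datetime.datetime.fromordinal(1)
cached_hash = hash(CachedThin(value))
builtin_hash = hash(value)
print(cached_hash == builtin_hash)
```

Timing this variant with the same `timeit.repeat` calls used in the question would show whether the cached lookup helps at all on a given interpreter.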

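Another reason to avoid calling the magic methods directly is that the comparison operators do more than call a single magic method. The methods also have reflected versions: for x < y, if x.__lt__ isn't defined, or x.__lt__(y) returns the NotImplemented singleton, then y.__gt__(x) is consulted as well.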
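The reflected fallback can be seen in a small sketch. The classes `Left` and `Right` are hypothetical and chosen only to make the behavior visible: the explicit call returns `NotImplemented`, while the operator goes on to try the reflected method on the right-hand operand.

```python
class Left:
    """Hypothetical left operand that declines the comparison."""

    def __lt__(self, other):
        return NotImplemented


class Right:
    """Hypothetical right operand that implements the reflected method."""

    def __gt__(self, other):
        return True


x, y = Left(), Right()

# Explicit call: returns the NotImplemented singleton and stops there.
explicit_result = x.__lt__(y)

# Sugared form: x.__lt__(y) returns NotImplemented, so Python falls back
# to the reflected call y.__gt__(x).
sugared_result = x < y

print(explicit_result, sugared_result)
```

Because `NotImplemented` is itself an object, code that delegates with an explicit `__lt__` call can silently pass it along instead of getting a real comparison result.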