Why and how are Python functions hashable?

Question

I recently tried the following commands in Python:

>>> {lambda x: 1: 'a'}
{<function __main__.<lambda>>: 'a'}

>>> def p(x): return 1
>>> {p: 'a'}
{<function __main__.p>: 'a'}

The success of both dict creations indicates that both lambda and regular functions are hashable. (Something like {[]: 'a'} fails with TypeError: unhashable type: 'list'.)

The hash is apparently not necessarily the ID of the function:

>>> m = lambda x: 1
>>> id(m)
140643045241584
>>> hash(m)
8790190327599
>>> m.__hash__()
8790190327599

The last command shows that the __hash__ method is explicitly defined for lambdas, i.e., this is not some automagical thing Python computes based on the type.

What is the motivation behind making functions hashable? For a bonus, what is the hash of a function?

Answer 1

It's nothing special. As you can see if you examine the unbound __hash__ method of the function type:

>>> def f(): pass
...
>>> type(f).__hash__
<slot wrapper '__hash__' of 'object' objects>

The "of 'object' objects" part means it just inherits the default identity-based __hash__ from object. Function == and hash work by identity. The difference between id and hash is normal for any type that inherits object.__hash__:

>>> x = object()
>>> id(x)
40145072L
>>> hash(x)
2509067

You might think __hash__ is only supposed to be defined for immutable objects, and you'd be almost right, but that's missing a key detail. __hash__ should only be defined for objects where everything involved in == comparisons is immutable. For objects whose == is based on identity, it's completely standard to base hash on identity as well, since even if the objects are mutable, they can't possibly be mutable in a way that would change their identity. Files, modules, and other mutable objects with identity-based == all behave this way.
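A minimal sketch of the points above, assuming CPython 3. The names f and g are made up for illustration. It checks that the function type reuses object.__hash__, that two functions with identical bodies still compare unequal, and that mutating a function leaves its hash alone.

```python
def f():
    return 1

def g():
    return 1

# The function type does not define its own __hash__; it reuses object's.
inherits_default = type(f).__hash__ is object.__hash__

# Equality is by identity, so identical bodies still compare unequal.
same_body_equal = (f == g)

# Functions are mutable (attributes can be attached), but that cannot
# change their identity, so the hash stays the same.
before = hash(f)
f.note = "mutated"
after = hash(f)

print(inherits_default, same_body_equal, before == after)
```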

Answer 2

It can be useful, e.g., to create sets of function objects, or to index a dict by functions. Immutable objects normally support __hash__. In any case, there's no internal difference between a function defined by a def or by a lambda; that's purely syntactic.
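As a small illustration of those two uses (the functions and labels here are hypothetical, not from the question), functions can serve as dict keys and set members like any other hashable object:

```python
def double(x):
    return 2 * x

def square(x):
    return x * x

# A dict indexed by function objects (the labels are made up).
labels = {double: "doubles its input", square: "squares its input"}

# A set of functions: adding the same function twice keeps one entry.
callbacks = set()
callbacks.add(double)
callbacks.add(double)
callbacks.add(square)

# def and lambda produce objects of the same type.
same_type = type(lambda x: 1) is type(double)

print(labels[square], len(callbacks), same_type)
```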

The algorithm used depends on the version of Python. It looks like you're using a recent version of Python on a 64-bit box. In that case, the hash of a function object is the right rotation of its id() by 4 bits, with the result viewed as a signed 64-bit integer. The right shift is done because object addresses (id() results) are typically aligned so that their last 3 or 4 bits are always 0, and that's a mildly annoying property for a hash function.

In your specific example,

>>> i = 140643045241584 # your id() result
>>> (i >> 4) | ((i << 60) & 0xffffffffffffffff) # rotate right 4 bits
8790190327599  # == your hash() result
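The same calculation can be wrapped in a helper that also does the signed reinterpretation. This is only a sketch: rotate_right_4 is a hypothetical name, and the comparison against a live function relies on a CPython implementation detail, so it is printed rather than assumed.

```python
import struct

def rotate_right_4(address, bits=64):
    """Rotate an unsigned `bits`-wide integer right by 4 bits, then
    reinterpret the result as a signed integer of the same width."""
    mask = (1 << bits) - 1
    rotated = ((address >> 4) | (address << (bits - 4))) & mask
    if rotated >= 1 << (bits - 1):
        rotated -= 1 << bits
    return rotated

# The id() value from the question reproduces the reported hash.
question_hash = rotate_right_4(140643045241584)
print(question_hash)  # 8790190327599

# Compare against a real function on the running interpreter.
# Expected to agree on CPython, but not guaranteed by the language.
def f():
    pass

pointer_bits = struct.calcsize("P") * 8
print(rotate_right_4(id(f), pointer_bits) == hash(f))
```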

Answer 3

A function is hashable because it is a normal, built-in, non-mutable object.

From the Python manual:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() or __cmp__() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python's immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal (except with themselves), and their hash value is derived from their id().
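The quoted rules can be checked with a short sketch. Point is a hypothetical class that defines neither __eq__ nor __hash__, so it gets the defaults described in the quote.

```python
class Point:
    """A user-defined class with no __eq__ or __hash__ of its own."""
    def __init__(self, x):
        self.x = x

a = Point(1)
b = Point(1)

# Instances are hashable by default and compare unequal to each other.
instances_hashable = len({a, b}) == 2
a_equals_b = (a == b)

# Mutating an instance does not change its hash.
h = hash(a)
a.x = 99
hash_stable = (hash(a) == h)

# A mutable container such as a list is not hashable.
try:
    hash([])
    list_hashable = True
except TypeError:
    list_hashable = False

print(instances_hashable, a_equals_b, hash_stable, list_hashable)
```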

3 answers

Answer 1
52 accepted 2016-07-22 05:34:56

Answer 2
24 2016-07-22 05:58:09

Answer 3
4 2016-07-22 05:42:32
