Why does the hash of infinity in Python have the digits of π?

Question

The hash of infinity in Python has digits matching pi:

>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

Is that just a coincidence or is it intentional?

Answer 1

Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in CPython, the default implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.

The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159

(Same results with PyPy too.)
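As a quick check of the relationship described above, the following sketch compares hash() on infinite floats with sys.hash_info.inf. It assumes a Python 3 interpreter; the variable names are made up for illustration.

```python
import math
import sys

pos_inf = float("inf")
neg_inf = float("-inf")

# hash() of positive infinity should equal the constant exposed by sys.hash_info.
inf_hash = hash(pos_inf)
matches_hash_info = inf_hash == sys.hash_info.inf

# math.inf is the same float value, so it hashes identically.
same_for_math_inf = hash(math.inf) == inf_hash

# On Python 3, negative infinity hashes to the negated constant.
neg_inf_hash = hash(neg_inf)

print(inf_hash, neg_inf_hash, matches_hash_info, same_for_math_inf)
```

Comparing against sys.hash_info.inf rather than the literal 314159 keeps the check valid on any implementation that exposes a different constant.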

In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

where _PyHASH_INF is defined as 314159:

#define _PyHASH_INF 314159
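The infinity branch quoted above can be mirrored in Python to see that it reproduces what hash() returns. This is only a sketch: hash_if_infinite and PYHASH_INF are hypothetical names, and the code paths CPython uses for finite values and NaN are deliberately not modelled.

```python
import math

PYHASH_INF = 314159  # mirrors the C macro _PyHASH_INF quoted above


def hash_if_infinite(v):
    """Return the hash the quoted C branch yields for an infinity, else None.

    Hypothetical helper for illustration only; finite values and NaN take
    other code paths in CPython that this sketch does not reproduce.
    """
    if math.isinf(v):
        return PYHASH_INF if v > 0 else -PYHASH_INF
    return None


for value in (float("inf"), float("-inf")):
    # On Python 3 both columns should agree.
    print(value, hash_if_infinite(value), hash(value))
```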

In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.

The commit message says:

Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470 . This was a misleading bug -- the true "bug" was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent's earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn't an error (didn't waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.

In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
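Both constants can be recovered from the leading digits of π and e, in the same way the question computes 314159. The sketch below also models the substitution made by the 2000-era lines quoted above; legacy_infinity_stand_in is a hypothetical name, and it covers only the replacement of the infinite value, not the rest of the old hashing code.

```python
import math

# Leading six significant digits of pi and e, taken as integers.
pi_digits = int(math.pi * 1e5)
e_digits = int(math.e * 1e5)


def legacy_infinity_stand_in(v):
    """Model the quoted line `v = v < 0 ? -271828.0 : 314159.0`.

    Hypothetical helper: it returns the finite value that the old code
    substituted for an infinity before hashing continued.
    """
    return -271828.0 if v < 0 else 314159.0


print(pi_digits, e_digits)
print(legacy_infinity_stand_in(float("inf")), legacy_infinity_stand_in(float("-inf")))
```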

Related later commits:

By Mark Dickinson in Apr 2010 (also), making the Decimal type behave similarly
By Mark Dickinson in Apr 2010 (also), moving this check to the top and adding test cases
By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation, but retaining this special case and giving the constant the name _PyHASH_INF (also removing the 271828, which is why in Python 3 hash(float('-inf')) returns -314159 rather than -271828 as it does in Python 2)
By Raymond Hettinger in Jan 2011, adding an explicit example in the "What's new" for Python 3.2 of sys.hash_info showing the above value. (See here.)
By Stefan Krah in Mar 2012, modifying the Decimal module but keeping this hash.
By Christian Heimes in Nov 2013, moving the definition of _PyHASH_INF from Include/pyport.h to Include/pyhash.h, where it now lives.
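The Decimal-related commits listed above mean that an infinite Decimal hashes the same way as an infinite float. A minimal sketch to check this, assuming Python 3 with the standard decimal module (variable names are illustrative):

```python
import sys
from decimal import Decimal

dec_pos_hash = hash(Decimal("Infinity"))
dec_neg_hash = hash(Decimal("-Infinity"))
float_pos_hash = hash(float("inf"))
float_neg_hash = hash(float("-inf"))

# Equal values must hash equally, so the two infinities collapse in a set.
infinities = {float("inf"), Decimal("Infinity")}

print(dec_pos_hash, dec_neg_hash, sys.hash_info.inf, len(infinities))
```

The shared hash is what lets float('inf') and Decimal('Infinity'), which compare equal, be used interchangeably as dictionary keys or set members.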

Answer 2

_PyHASH_INF is defined as a constant equal to 314159.

I can't find any discussion about this, or comments giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don't use the same meaningful value for other hashes, it shouldn't matter.
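On the point about sharing a value with other hashes: in CPython the integer 314159 hashes to itself, so it collides with infinity. The sketch below, with illustrative names, suggests why that is harmless: dictionaries compare keys for equality after matching hashes, so the two keys stay distinct.

```python
inf = float("inf")

# Small integers hash to themselves in CPython, so this collides with hash(inf).
int_hash = hash(314159)
inf_hash = hash(inf)

# A hash collision does not merge keys; equality is still checked.
table = {314159: "integer", inf: "infinity"}

print(int_hash == inf_hash, len(table), table[314159], table[inf])
```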

Answer 3

Indeed,

sys.hash_info.inf

returns 314159. The value is not generated; it's built into the source code. In fact,

hash(float('-inf'))

returns -271828, the leading digits of e with a minus sign, in Python 2 (it's -314159 now).

The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.

Answer 1
224 2019-05-20 20:42:07

Answer 2
48 Accepted 2019-05-20 20:19:58

Answer 3
13 2019-05-21 16:39:13
