
Why does Python's hash of infinity have the digits of π?

The hash of infinity in Python has digits matching pi:

>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

Is that just a coincidence or is it intentional?

Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.


The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159

(Same results with PyPy too.)
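The relationship described above can be checked directly. The following is a minimal sketch, assuming Python 3 (where sys.hash_info exists); the variable names are illustrative only. It also checks decimal.Decimal, on the assumption that numeric types that compare equal must hash equal:

```python
import sys
from decimal import Decimal

pos_inf = float("inf")

# The built-in hash of positive infinity should be the published constant.
float_matches = hash(pos_inf) == sys.hash_info.inf

# Decimal infinity compares equal to float infinity, so the two hashes
# are expected to agree as well.
decimal_matches = hash(Decimal("Infinity")) == hash(pos_inf)

print(sys.hash_info.inf, float_matches, decimal_matches)
```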


In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

where _PyHASH_INF is defined as 314159:

#define _PyHASH_INF 314159
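To make the control flow concrete, here is a rough pure-Python model of the same logic. It is not CPython's actual code: sketch_hash_float is a hypothetical name, and the finite branch is modelled on the hash_float/hash_fraction recipe in the Python 3 documentation, so it assumes Python 3 and its sys.hash_info parameters.

```python
import math
import sys


def sketch_hash_float(x: float) -> int:
    """Illustrative model of float hashing on Python 3 (hypothetical helper)."""
    if math.isnan(x):
        # NaN handling differs between Python versions; defer to the built-in.
        return hash(x)
    if math.isinf(x):
        # Mirrors: return v > 0 ? _PyHASH_INF : -_PyHASH_INF;
        return sys.hash_info.inf if x > 0 else -sys.hash_info.inf

    # Finite floats are exact fractions m / n with n a power of two.
    m, n = x.as_integer_ratio()
    P = sys.hash_info.modulus  # an odd prime, so n is invertible modulo P
    # pow(n, P - 2, P) is the modular inverse of n (Fermat's little theorem).
    h = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        h = -h
    # -1 is reserved as an error code at the C level, so it is never returned.
    return -2 if h == -1 else h


for value in (0.5, -1.0, float("inf"), float("-inf")):
    print(value, sketch_hash_float(value), hash(value))
```

Only the two infinity cases use the hardcoded constant; every finite value goes through the modular-arithmetic branch.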

In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.

The commit message says:

Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470 . This was a misleading bug -- the true "bug" was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent's earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn't an error (didn't waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.

In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.

Related later commits:

_PyHASH_INF is defined as a constant equal to 314159.

I can't find any discussion about this, or comments giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don't use the same meaningful value for other hashes, it shouldn't matter.

Indeed,

sys.hash_info.inf

returns 314159. The value is not generated; it's built into the source code. In fact,

hash(float('-inf'))

returns -271828, or approximately -e, in Python 2 (it's -314159 now).

The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.
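Both observations are easy to check. This is a small sketch assuming Python 3, where the hash of negative infinity is expected to be the negation of sys.hash_info.inf; the variable names are illustrative only.

```python
import math
import sys

neg_inf = float("-inf")

# On Python 3, -inf hashes to the negated infinity constant.
neg_matches = hash(neg_inf) == -sys.hash_info.inf

# First six significant digits of pi and e, obtained by truncation.
pi_digits = int(math.pi * 1e5)
e_digits = int(math.e * 1e5)

print(neg_matches, pi_digits, e_digits)
```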
