Unexpected floating-point representations in Python

Hello, I am using a dictionary in Python to store some cities and their populations, like this:

population = { 'Shanghai' : 17.8, 'Istanbul' : 13.3, 'Karachi' : 13.0, 'mumbai' : 12.5 }

Now if I use the command print population, I get the result:

{'Karachi': 13.0, 'Shanghai': 17.800000000000001, 'Istanbul': 13.300000000000001, 'mumbai': 12.5}

whereas if I use the command print population['Shanghai'], I get the initial input of 17.8.

My question is: how did 17.8 and 13.3 turn into 17.800000000000001 and 13.300000000000001 respectively? How was all that extra information produced? And why is it stored there, since my initial input indicates that I do not need it, at least as far as I know?

This has been changed in Python 3.1. From the What's New page:

Python now uses David Gay's algorithm for finding the shortest floating point representation that doesn't change its value. This should help mitigate some of the confusion surrounding binary floating point numbers.

The significance is easily seen with a number like 1.1 which does not have an exact equivalent in binary floating point. Since there is no exact equivalent, an expression like float('1.1') evaluates to the nearest representable value which is 0x1.199999999999ap+0 in hex or 1.100000000000000088817841970012523233890533447265625 in decimal. That nearest value was and still is used in subsequent floating point calculations.

What is new is how the number gets displayed. Formerly, Python used a simple approach. The value of repr(1.1) was computed as format(1.1, '.17g') which evaluated to '1.1000000000000001'. The advantage of using 17 digits was that it relied on IEEE-754 guarantees to assure that eval(repr(1.1)) would round-trip exactly to its original value. The disadvantage is that many people found the output to be confusing (mistaking intrinsic limitations of binary floating point representation as being a problem with Python itself).

The new algorithm for repr(1.1) is smarter and returns '1.1'. Effectively, it searches all equivalent string representations (ones that get stored with the same underlying float value) and returns the shortest representation.

The new algorithm tends to emit cleaner representations when possible, but it does not change the underlying values. So, it is still the case that 1.1 + 2.2 != 3.3 even though the representations may suggest otherwise.

The new algorithm depends on certain features in the underlying floating point implementation. If the required features are not found, the old algorithm will continue to be used. Also, the text pickle protocols assure cross-platform portability by using the old algorithm.

(Contributed by Eric Smith and Mark Dickinson; issue 1580)
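The quoted passage can be checked interactively. The sketch below assumes Python 3.1 or later on a platform where the short repr is available. The helper name shortest_roundtrip is hypothetical, and it is a brute-force stand-in for the idea of "shortest string that round-trips", not David Gay's actual algorithm.

```python
from decimal import Decimal


def shortest_roundtrip(x):
    """Hypothetical brute-force helper: try 1..17 significant digits and
    return the first string that parses back to exactly the same float.
    This only illustrates the idea; it is not David Gay's algorithm."""
    for digits in range(1, 18):
        candidate = format(x, '.%dg' % digits)
        if float(candidate) == x:
            return candidate
    return format(x, '.17g')


x = 1.1

# The exact binary value stored for the literal 1.1.
exact_hex = x.hex()
exact_decimal = str(Decimal(x))

# Old display rule: always 17 significant digits.
old_style = format(x, '.17g')

# Display rule since Python 3.1: shortest string that round-trips.
new_style = repr(x)

# Both strings denote the same stored float.
same_value = float(old_style) == float(new_style) == x

# Only the display changed; the arithmetic did not.
sum_differs = (1.1 + 2.2 != 3.3)

print(exact_hex)                  # 0x1.199999999999ap+0
print(exact_decimal)
print(old_style, new_style)       # 1.1000000000000001 1.1
print(same_value, sum_differs)    # True True
print(shortest_roundtrip(17.8))   # 17.8
```

The same helper returns '0.30000000000000004' for 0.1 + 0.2, because no shorter string maps back to that particular float.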

You need to read up on how floating-point numbers work in computers.

Basically, not all decimal numbers can be stored exactly, and in those cases you get the closest representable number. Sometimes this abstraction leaks, and you get to see the error.

This is probably due to differences in the printing logic used for the two use cases you describe. I couldn't reproduce the behavior (using Python 2.7.2 on Win64).
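To make the "printing logic" point concrete: in Python 2, printing a dictionary shows the repr() of each value, while printing a single float shows its str(). Before Python 2.7, repr() used 17 significant digits and str() used 12, which would explain both outputs in the question. Python 2.7 also adopted the shorter repr, which is consistent with the behavior not reproducing there. The sketch below runs on Python 3 and only approximates the two old rules with format(); the helper names old_repr and old_str are made up for illustration.

```python
population = {'Shanghai': 17.8, 'Istanbul': 13.3, 'Karachi': 13.0, 'mumbai': 12.5}


def old_repr(x):
    # Approximation of repr(float) before Python 2.7 / 3.1:
    # 17 significant digits. (The real repr also kept a trailing ".0".)
    return format(x, '.17g')


def old_str(x):
    # Approximation of str(float) in Python 2: 12 significant digits.
    return format(x, '.12g')


# What "print population" effectively showed for each value.
for city, millions in population.items():
    print(city, old_repr(millions))

# What "print population['Shanghai']" effectively showed.
print(old_str(population['Shanghai']))  # 17.8
```

With 12 digits the stored error is rounded away, so 17.8 prints as typed; with 17 digits it becomes visible.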

If you use a number that is exactly representable, such as 1.5, I would expect the effect to go away.
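One way to test that guess is to compare the exact value of the stored float with the exact value of the decimal text. The sketch below uses fractions.Fraction for this; the helper name is_exact is hypothetical.

```python
from fractions import Fraction


def is_exact(literal):
    """Hypothetical helper: True if the decimal string is stored as a
    binary float without any rounding error."""
    return Fraction(float(literal)) == Fraction(literal)


# A float is stored exactly only when it is an integer divided by a power of two.
print((1.5).as_integer_ratio())   # (3, 2)
print((12.5).as_integer_ratio())  # (25, 2)

for text in ['1.5', '12.5', '13.0', '17.8', '13.3']:
    print(text, is_exact(text))
```

This matches the output in the question: 13.0 and 12.5 are exact and print cleanly, while 17.8 and 13.3 are not.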

You have to use decimal.Decimal if you want the decimal represented exactly as you specified it, on any machine in the world.

See the Python manual for information: http://docs.python.org/library/decimal.html

>>> from decimal import Decimal
>>> print Decimal('3.14')
3.14
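A slightly longer sketch, written for Python 3, applies this to the data from the question. One caveat worth showing: the Decimal should be built from a string. Building it from a float literal copies the binary rounding error into the Decimal.

```python
from decimal import Decimal

# Build the values from strings so they are stored exactly as typed.
population = {
    'Shanghai': Decimal('17.8'),
    'Istanbul': Decimal('13.3'),
    'Karachi': Decimal('13.0'),
    'mumbai': Decimal('12.5'),
}

print(population['Shanghai'])  # 17.8

# Decimal arithmetic on these values is exact.
total = Decimal('1.1') + Decimal('2.2')
print(total == Decimal('3.3'))  # True

# Constructing from a float keeps the float's binary error.
from_float = Decimal(17.8)
print(from_float == Decimal('17.8'))  # False
```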
