Function to determine if two numbers are nearly equal when rounded to n significant decimal digits

I have been asked to test a library provided by a third party. The library is known to be accurate to n significant figures. Any less-significant errors can safely be ignored. I want to write a function to help me compare the results:

def nearlyequal( a, b, sigfig=5 ):

The purpose of this function is to determine if two floating-point numbers (a and b) are approximately equal. The function will return True if a==b (exact match) or if a and b have the same value when written in decimal and rounded to sigfig significant figures.

Can anybody suggest a good implementation? I've written a mini unit-test. Unless you can see a bug in my tests, a good implementation should pass the following:

assert nearlyequal(1, 1, 5) 
assert nearlyequal(1.0, 1.0, 5) 
assert nearlyequal(1.0, 1.0, 5) 
assert nearlyequal(-1e-9, 1e-9, 5) 
assert nearlyequal(1e9, 1e9 + 1 , 5) 
assert not nearlyequal( 1e4, 1e4 + 1, 5) 
assert nearlyequal( 0.0, 1e-15, 5 ) 
assert not nearlyequal( 0.0, 1e-4, 6 ) 

Additional notes:

  1. Values a and b might be of type int, float or numpy.float64. Values a and b will always be of the same type. It's vital that conversion does not introduce additional error into the function.
  2. Let's keep this numerical, so functions that convert to strings or use non-mathematical tricks are not ideal. This program will be audited by a mathematician who will want to be able to prove that the function does what it is supposed to do.
  3. Speed... I've got to compare a lot of numbers, so the faster the better.
  4. I've got numpy, scipy and the standard library. Anything else will be hard for me to get, especially for such a small part of the project.

As of Python 3.5, the standard way to do this (using the standard library) is with the math.isclose function.

It has the following signature:

isclose(a, b, rel_tol=1e-9, abs_tol=0.0)

An example of usage with absolute error tolerance:

from math import isclose
a = 1.0
b = 1.00000001
assert isclose(a, b, abs_tol=1e-8)

Significant digits are a relative measure, so if you want agreement to roughly n significant digits, replace the last line with a relative tolerance (abs_tol=10**-n would instead check n decimal places):

assert isclose(a, b, rel_tol=10**-n)
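As a rough sketch, the question's nearlyequal could be built on math.isclose by combining a relative and an absolute tolerance. Using 10**-sigfig for both tolerances is an assumption made here so that the question's unit tests pass; it does not literally round to significant figures, and it treats anything smaller than 10**-sigfig as indistinguishable from zero.

```python
from math import isclose


def nearlyequal(a, b, sigfig=5):
    """Approximate 'equal to sigfig significant figures' with tolerances.

    Assumption: the same value, 10**-sigfig, is used as the relative
    tolerance and as the absolute floor near zero.
    """
    tol = 10.0 ** -sigfig
    return isclose(a, b, rel_tol=tol, abs_tol=tol)


# The unit tests from the question
assert nearlyequal(1, 1, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(-1e-9, 1e-9, 5)
assert nearlyequal(1e9, 1e9 + 1, 5)
assert not nearlyequal(1e4, 1e4 + 1, 5)
assert nearlyequal(0.0, 1e-15, 5)
assert not nearlyequal(0.0, 1e-4, 6)
```

If the values under test are naturally tiny (say, all around 1e-12), the absolute floor would need to be chosen separately from the relative tolerance.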

There is a function assert_approx_equal in numpy.testing which may be a good starting point.

def assert_approx_equal(actual,desired,significant=7,err_msg='',verbose=True):
    """
    Raise an assertion if two items are not equal up to significant digits.

    .. note:: It is recommended to use one of `assert_allclose`,
              `assert_array_almost_equal_nulp` or `assert_array_max_ulp`
              instead of this function for more consistent floating point
              comparisons.

    Given two numbers, check that they are approximately equal.
    Approximately equal is defined as the number of significant digits
    that agree.

Here's a take.

def nearly_equal(a,b,sig_fig=5):
    return ( a==b or 
             int(a*10**sig_fig) == int(b*10**sig_fig)
           )

I believe your question is not defined well enough, and the unit-tests you present prove it:

If by 'round to N sig-fig decimal places' you mean 'N decimal places to the right of the decimal point', then the test assert nearlyequal(1e9, 1e9 + 1 , 5) should fail, because even when you round 1000000000 and 1000000001 to 0.00001 accuracy, they are still different.

And if by 'round to N sig-fig decimal places' you mean 'the N most significant digits, regardless of the decimal point', then the test assert nearlyequal(-1e-9, 1e-9, 5) should fail, because 0.000000001 and -0.000000001 are totally different when viewed this way.

If you meant the first definition, then the first answer on this page (by Triptych) is good. If you meant the second definition, please say so; I promise to think about it :-)
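To make the two readings concrete, here is a small sketch that implements each one and runs the two disputed tests against both. The function names are made up for this illustration, and the significant-figure version leans on math.log10 and the built-in round, so it inherits their floating-point behaviour.

```python
import math


def same_decimal_places(a, b, n):
    """First reading: equal after rounding to n places after the decimal point."""
    return round(a, n) == round(b, n)


def round_sig(x, n):
    """Round x to n significant decimal digits (zero stays zero)."""
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))  # position of the leading digit
    return round(x, n - 1 - exponent)


def same_sig_figs(a, b, n):
    """Second reading: equal after rounding to n significant digits."""
    return round_sig(a, n) == round_sig(b, n)


# Decimal places: the tiny values collapse to zero, the large ones differ.
assert same_decimal_places(-1e-9, 1e-9, 5)
assert not same_decimal_places(1e9, 1e9 + 1, 5)

# Significant figures: the large values agree, the tiny ones differ in sign.
assert same_sig_figs(1e9, 1e9 + 1, 5)
assert not same_sig_figs(-1e-9, 1e-9, 5)
```

Neither reading passes both tests, which is the inconsistency described above.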

There are already plenty of great answers, but here's a thought:

import math

def closeness(a, b):
  """Returns measure of equality (for two floats), in unit
     of decimal significant figures."""
  if a == b:
    return float("infinity")
  difference = abs(a - b)
  # Use magnitudes so the ratio stays positive for negative inputs.
  avg = (abs(a) + abs(b)) / 2
  return math.log10(avg / difference)


if closeness(1000, 1000.1) > 3:
  print("Joy!")

"Significant figures" in decimal is a matter of adjusting the decimal point and truncating to an integer.

>>> int(3.1415926 * 10**3)
3141
>>> int(1234567 * 10**-3)
1234
>>>

This is a fairly common issue with floating point numbers. I solve it based on the discussion in Section 1.5 of Demmel[1]. (1) Calculate the roundoff error. (2) Check that the roundoff error is less than some epsilon. I haven't used Python in some time and only have version 2.4.3, but I'll try to get this correct.

Step 1. Roundoff error

def roundoff_error(exact, approximate):
    return abs(approximate/exact - 1.0)

Step 2. Floating point equality

def float_equal(float1, float2, epsilon=2.0e-9):
    return (roundoff_error(float1, float2) < epsilon)

There are a couple of obvious deficiencies with this code.

  1. Division by zero error if the exact value is zero.
  2. Does not verify that the arguments are floating point values.

Revision 1.

def roundoff_error(exact, approximate):
    if (exact == 0.0 or approximate == 0.0):
        return abs(exact + approximate)
    else:
        return abs(approximate/exact - 1.0)

def float_equal(float1, float2, epsilon=2.0e-9):
    # The call form of raise works in both Python 2 and Python 3.
    if not isinstance(float1, float):
        raise TypeError("First argument is not a float.")
    elif not isinstance(float2, float):
        raise TypeError("Second argument is not a float.")
    else:
        return (roundoff_error(float1, float2) < epsilon)

That's a little better. If either the exact or the approximate value is zero, then the error is equal to the absolute value of the other. If something besides a floating point value is provided, a TypeError is raised.

At this point, the only difficult thing is setting the correct value for epsilon. I noticed in the documentation for version 2.6.1 that there is an epsilon attribute in sys.float_info, so I would use twice that value as the default epsilon. But the correct value depends on both your application and your algorithm.
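A minimal sketch of that idea, assuming a Python version where sys.float_info is available: the default epsilon is tied to twice the machine epsilon, and the caller can loosen it. The type checks are left out for brevity, and the 1e-5 tolerance in the usage lines is only an illustrative choice, not a recommendation.

```python
import sys


def roundoff_error(exact, approximate):
    # Relative error, falling back to the magnitude of the other value at zero.
    if exact == 0.0 or approximate == 0.0:
        return abs(exact + approximate)
    return abs(approximate / exact - 1.0)


def float_equal(float1, float2, epsilon=2.0 * sys.float_info.epsilon):
    # Default tolerance: twice the spacing between 1.0 and the next double.
    return roundoff_error(float1, float2) < epsilon


# Differences at the level of rounding noise pass with the default.
assert float_equal(0.1 + 0.2, 0.3)

# A library accurate to only a few digits needs a much looser epsilon.
assert not float_equal(1.0, 1.0 + 1e-9)
assert float_equal(1.0, 1.0 + 1e-9, epsilon=1e-5)
```

The default is deliberately strict, so for the "accurate to n significant figures" case in the question the epsilon would have to be passed explicitly.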

[1] James W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.

There is an interesting solution to this by B. Dawson (with C++ code) in "Comparing Floating Point Numbers". His approach relies on the strict IEEE representation of the two numbers and on the lexicographical ordering that results when those numbers are reinterpreted as unsigned integers.
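The general idea can be sketched in Python with the struct module. This is not Dawson's code, only an illustration of the bit-reinterpretation trick for 64-bit doubles; the function names and the max_ulps default are invented here, and NaN and infinity are not given any special treatment.

```python
import struct


def ordered_int(x):
    """Map a double to an integer whose ordering matches the float ordering."""
    # Reinterpret the 8 bytes of the IEEE 754 double as a signed 64-bit integer.
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    if bits < 0:
        # Negative floats: mirror the magnitude bits so that -0.0 maps to 0
        # and more negative floats map to more negative integers.
        bits = -(bits & 0x7FFFFFFFFFFFFFFF)
    return bits


def ulp_distance(a, b):
    """Number of representable doubles between a and b."""
    return abs(ordered_int(a) - ordered_int(b))


def almost_equal_ulps(a, b, max_ulps=4):
    return ulp_distance(a, b) <= max_ulps


assert ulp_distance(1.0, 1.0 + 2.0 ** -52) == 1   # adjacent doubles
assert ulp_distance(0.0, -0.0) == 0               # both zeros coincide
assert almost_equal_ulps(0.1 + 0.2, 0.3)
assert not almost_equal_ulps(1.0, 1.0001)
```

A distance measured in representable values is very strict compared with "n significant decimal digits", so this suits checks for rounding noise more than checks against a library of limited accuracy.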

I have been asked to test a library provided by a 3rd party

If you are using the default Python unittest framework, you can use assertAlmostEqual

self.assertAlmostEqual(a, b, places=5)
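Note that places counts decimal places of the difference (the check is round(a - b, places) == 0), not significant figures. A short sketch of what that means in practice follows; the test class name is hypothetical, and the delta value is just one way to express a relative tolerance by hand.

```python
import unittest


class LibraryResultTest(unittest.TestCase):
    def test_small_values_agree(self):
        # Difference of 1e-6 rounds to zero at 5 decimal places.
        self.assertAlmostEqual(1.000001, 1.000002, places=5)

    def test_large_values_differ_with_places(self):
        # A difference of 1 never rounds to zero, however large the values.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1e9, 1e9 + 1, places=5)

    def test_large_values_agree_with_delta(self):
        # delta is an absolute bound; scale it by the magnitude yourself.
        self.assertAlmostEqual(1e9, 1e9 + 1, delta=1e9 * 1e-5)
```

Saved in a test module, this can be run with python -m unittest.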

Oren Shemesh identified part of the problem with the question as stated, but there's more:

assert nearlyequal( 0.0, 1e-15, 5 )

also fails the second definition (and that's the definition I learned in school.)

No matter how many digits you are looking at, 0 will not equal a nonzero value. This could prove to be a headache for such tests if you have a case whose correct answer is zero.

There are lots of ways of comparing two numbers to see if they agree to N significant digits. Roughly speaking, you just want to make sure that their difference is less than 10^-N times the larger of the two numbers being compared. That's easy enough.

But what if one of the numbers is zero? The whole concept of relative differences or significant digits breaks down when comparing against zero. To handle that case you need an absolute difference as well, which should be specified separately from the relative difference.
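Written out by hand, the two checks might look like the sketch below. The function and parameter names are placeholders chosen for this example, and abs_tol defaults to 0.0 to show how the relative test alone behaves at zero.

```python
def agree_to_n_digits(a, b, n, abs_tol=0.0):
    """Relative check against the larger magnitude, plus an absolute floor."""
    diff = abs(a - b)
    largest = max(abs(a), abs(b))
    relative_ok = diff <= largest * 10.0 ** -n
    absolute_ok = diff <= abs_tol
    return relative_ok or absolute_ok


# Away from zero the relative check does all the work.
assert agree_to_n_digits(1e9, 1e9 + 1, 5)
assert not agree_to_n_digits(1e4, 1e4 + 1, 5)

# Against zero the relative check can never pass for a nonzero value...
assert not agree_to_n_digits(0.0, 1e-15, 5)
# ...so an explicit absolute tolerance is needed.
assert agree_to_n_digits(0.0, 1e-15, 5, abs_tol=1e-12)
```

Keeping abs_tol as its own argument reflects the point above: a sensible absolute floor depends on the scale of the data, not on the number of digits requested.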

I discuss the problems of comparing floating-point numbers, including a specific case of handling zero, in this blog post:

http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
