
Error in floating-point operation in C

Please let me know the difference between the following C functions.

static int mandel(float c_re, float c_im, int count) {
    float z_re = c_re, z_im = c_im;
    int i;
    for (i = 0; i < count; ++i) {
        if (z_re * z_re + z_im * z_im > 4.f)
            break;

        float new_re = z_re*z_re - z_im*z_im;
        float new_im = 2.f * z_re * z_im;
        z_re = c_re + new_re;
        z_im = c_im + new_im;
    }

    return i;
}

And the following:

static int mandel(float c_re, float c_im, int count) {
    float z_re = c_re, z_im = c_im;
    int i;
    for (i = 0; i < count; ++i) {
        if (z_re * z_re + z_im * z_im > 4.f)
            break;

        float new_im = 2.f * z_re * z_im;
        z_re = c_re + z_re*z_re - z_im*z_im;//I have combined the statements here and removed float new_re
        z_im = c_im + new_im;
    }

    return i;
}

Please see my comment in the code for the change. The function gives different values for some inputs. Is the float being rounded differently because I combined the two statements?

In mathematics the two statements would be equivalent. However, in computer hardware they may not be.

You could be getting round-off error because the initial result (new_re) is rounded and then added to c_re.

As Niklas mentioned:

intermediate values are stored with higher precision

So the value may lose some low-order bits when it is stored to new_re. If the intermediate value is instead added to c_re directly, a small c_re combined with the low-order bits of the new_re calculation may contribute to the end result.
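To make this concrete, here is a minimal sketch that emulates the wider intermediate explicitly with double, so the outcome does not depend on the compiler or FPU. The helper names re_two_steps and re_one_step are hypothetical and not part of the original code; the first rounds to float when the value is stored in new_re, the second rounds only once at the end.

```c
/* Hypothetical helper: computes z_re*z_re - z_im*z_im, rounds it to float
   (as storing it in new_re does), and only then adds c_re. */
static float re_two_steps(float c_re, float z_re, float z_im) {
    float new_re = (float)((double)z_re * z_re - (double)z_im * z_im);
    return (float)((double)c_re + (double)new_re);
}

/* Hypothetical helper: keeps the whole expression in double, standing in
   for a wider intermediate, and rounds to float a single time. */
static float re_one_step(float c_re, float z_re, float z_im) {
    return (float)((double)c_re + (double)z_re * z_re - (double)z_im * z_im);
}
```

With z_re = 1.0f + 0x1p-13f, z_im = 0.0f and c_re = 0x1.cp-25f, the square has a tail of 2^-26 that float discards, and c_re alone is also below half a unit in the last place. The two-step version therefore returns 1 + 2^-12, while the single-rounding version sees both tails together and returns 1 + 2^-12 + 2^-23.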

When evaluating a math expression, the code generated by a C or C++ compiler is allowed to keep intermediate results with a higher precision.

For example, on x86 computers C and C++ double values are normally 64-bit IEEE 754 floating-point numbers, but the x87 math processor stack uses 80 bits per value when doing the computations.

This means that the exact result of a computation will depend on which temporaries were stored to memory and which were instead kept on the FP stack. Normally this is not a problem because the precision of temporaries is higher than the precision of stored values, but this is not always true, because the computation may have been designed around the expected floating-point rounding rules.
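One way to check whether a given build keeps temporaries at higher precision is the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch below only reports what the implementation declares; the function names describe_eval_method and current_eval_method are made up for illustration.

```c
#include <float.h>

/* Hypothetical helper: turns a FLT_EVAL_METHOD value into a short description. */
static const char *describe_eval_method(int method) {
    switch (method) {
    case 0:  return "each operation is evaluated in its own type";
    case 1:  return "float and double operations are evaluated as double";
    case 2:  return "all operations are evaluated as long double";
    default: return "indeterminable or implementation-defined";
    }
}

/* Hypothetical helper: describes the evaluation method of the current build. */
static const char *current_eval_method(void) {
    return describe_eval_method((int)FLT_EVAL_METHOD);
}
```

Printing current_eval_method() from main is enough to see which case applies. A value of 2 is what one would expect from a build that uses the 80-bit stack described above, and 0 from a build where float arithmetic stays in float.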

Note also that compilers provide special flags to request strict math evaluation, or to allow very liberal evaluation to help optimization (including ignoring stores into local variables or rewriting operations into mathematically equivalent forms). The default today is often somewhat liberal rather than very strict, because strictness impairs performance.
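If the rounding of one particular intermediate matters and you would rather not depend on compiler flags, a common workaround is to pass the value through a volatile float, which forces an actual store and reload. This is a sketch under that assumption, not a guarantee for every compiler; round_through_float is a hypothetical name.

```c
/* Hypothetical helper: forces x to be rounded to float by storing it in a
   volatile variable, so the compiler cannot keep a wider copy in a register. */
static float round_through_float(double x) {
    volatile float stored = (float)x;
    return stored;
}
```

In the first version of mandel, writing new_re = round_through_float(z_re*z_re - z_im*z_im) would make the rounding at that point explicit, at the cost of an extra memory access in the loop.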


 