
Java and floating point arithmetic

Given the code

public static final float epsilon = 0.00000001f;

public static final float a [] = {
        -180.0f,
        -180.0f + epsilon * 2,
        -epsilon * 2
};

The array a is initialized as follows:

[-180.0, -180.0, -2.0E-8]

instead of the desired

[-180.0, X, Y]

How can I tune epsilon to achieve the desired result?


1) I want float rather than double, to stay consistent with previously written code.
2) I do not want -179.99999998 or any other particular number for X; I want X > -180.0, with X as close to -180.0 as possible.
3) I want Y to be as close to 0 as possible while still being a float.
4) I want -180.0 < X < Y.

In my initial post I did not specify precisely what I wanted. Patricia Shanahan guessed it by suggesting Math.ulp.

As recommended in prior answers, the best solution is to use double. However, if you want to work with float, you need to take into account its available precision in the region of interest. This program replaces your literal epsilon with the value associated with the least significant bit of 180f:

import java.util.Arrays;

public class Test {
  public static final float epsilon = Math.ulp(-180f);

  public static final float a [] = {
          -180.0f,
          -180.0f + epsilon * 2,
          -epsilon * 2
  };

  public static void main(String[] args) {
    System.out.println(Arrays.toString(a));
  }

}

Output:

[-180.0, -179.99997, -3.0517578E-5]
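If the goal is the tightest possible values rather than a shared epsilon, a minimal sketch along the following lines may help. It is a different technique from the program above: it uses Math.nextUp for X and Float.MIN_VALUE for Y. It assumes Y is meant to be negative, as -epsilon * 2 in the question suggests; the class name ClosestFloats and the constant names X and Y are made up for illustration.

```java
import java.util.Arrays;

class ClosestFloats {
    // X: the smallest float strictly greater than -180.0f.
    static final float X = Math.nextUp(-180.0f);

    // Y: the negative float closest to zero (a subnormal value).
    // Assumption: Y should be negative, as in the question's -epsilon * 2.
    static final float Y = -Float.MIN_VALUE;

    static final float[] a = { -180.0f, X, Y };

    public static void main(String[] args) {
        System.out.println(Arrays.toString(a));
        // Checks the ordering requested in the question: -180.0 < X < Y.
        System.out.println(-180.0f < X && X < Y && Y < 0.0f); // prints true
    }
}
```

Here X differs from -180.0f by exactly one ulp, so it is half the step produced by epsilon * 2 in the program above.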

Although the value 0.00000001f is within float's precision capacity, the value -180f + 0.00000001f * 2 (-179.99999998) is not. A float has only about 7-8 significant decimal digits of precision, and -179.99999998 requires at least 11. The low-order bits are therefore dropped by the addition, and the rounded result ends up being exactly -180.0f.
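The effect can be checked directly. The following is a small sketch, not taken from the original post; the class name AbsorptionDemo and its helper methods are hypothetical. It performs the same addition once in float and once in double and compares each result with -180.

```java
class AbsorptionDemo {
    // Same expression as in the question, evaluated in float arithmetic.
    static float floatSum() {
        float epsilon = 0.00000001f;
        return -180.0f + epsilon * 2;
    }

    // The same expression evaluated in double arithmetic.
    static double doubleSum() {
        double epsilon = 0.00000001;
        return -180.0 + epsilon * 2;
    }

    public static void main(String[] args) {
        // In float the small term is rounded away entirely.
        System.out.println(floatSum() == -180.0f); // prints true
        // In double the small term survives the addition.
        System.out.println(doubleSum() == -180.0); // prints false
        // Spacing between adjacent representable values near -180.
        System.out.println(Math.ulp(-180.0f));
        System.out.println(Math.ulp(-180.0));
    }
}
```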

Just for the fun of it, here are those values in bits (n = -180.0f):

           sign
           | exponent       significand
           - -------- -----------------------
epsilon  = 0 01100100 01010111100110001110111
epsilon2 = 0 01100101 01010111100110001110111
n        = 1 10000110 01101000000000000000000
result   = 1 10000110 01101000000000000000000

The result ends up being bit-for-bit the same as the original -180.0f.
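A bit breakdown like this can be produced with Float.floatToIntBits. The sketch below is one possible way to do it and is not necessarily how the listing above was generated; the class name FloatBits and the helper format are hypothetical.

```java
class FloatBits {
    // Formats a float as "sign exponent significand" from its IEEE 754 bit pattern.
    static String format(float value) {
        int bits = Float.floatToIntBits(value);
        // Left-pad to 32 characters, since toBinaryString drops leading zeros.
        String binary = String.format("%32s", Integer.toBinaryString(bits)).replace(' ', '0');
        return binary.substring(0, 1) + " " + binary.substring(1, 9) + " " + binary.substring(9);
    }

    public static void main(String[] args) {
        float epsilon = 0.00000001f;
        float n = -180.0f;
        System.out.println("epsilon  = " + format(epsilon));
        System.out.println("epsilon2 = " + format(epsilon * 2));
        System.out.println("n        = " + format(n));
        System.out.println("result   = " + format(n + epsilon * 2));
    }
}
```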

If you use double, that problem goes away, because you aren't exceeding double's ~15 digits of precision.

Try the double type. If that is not precise enough, note that Java has no "long double"; java.math.BigDecimal is the usual alternative.
