How do I use Math.ulp(double) to calculate the total floating-point rounding error of a set of arithmetic operations in Java?

Question

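I want to use the Math.ulp(double) method in Java to calculate the floating-point rounding error of a series of additions, multiplications, and divisions. According to the Wikipedia page on unit in the last place (ULP), the error from a single floating-point calculation, say 2 + 3 or 2 * 3, would be 0.5 * ulp(2 + 3) or 0.5 * ulp(2 * 3), where 2 * 3 and 2 + 3 are the floating-point calculations. However, adding up these errors does not account for the actual error I get in the final result. For example, saying that the maximum error of 2 + 3 * 4 is 0.5 * ulp(2 + [3 * 4]) + 0.5 * ulp(3 * 4) does not seem to account for the actual error I get. So I am confused: perhaps I am misunderstanding Math.ulp(double), or perhaps I need to use some kind of relative error. I don't know. Could anyone explain this to me, perhaps with a few examples of adding, multiplying, and dividing floating-point and exact numbers? It would be much appreciated.

I am trying to compute the reduced row echelon form of a matrix for a Matrix class, and I need to know whether, after a few calculations, certain entries of the two-dimensional array I use for the computation are equal to 0. If a row is all zeros, it exits the code. If there is a nonzero number in it, it divides that number by itself and then performs Gaussian elimination. The problem is that after a series of operations, floating-point error can creep in, and a calculation that should result in zero ends up as a nonzero number, which then messes up my matrix computation. So I am trying to change the condition under which Gaussian elimination happens from zero to less than a computed error bound. I compute the error bound for each entry of the matrix from the calculations performed on that entry, and accumulate the bounds in a new error array. Here is my code: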

/**
 * Finds the reduced row echelon form of the matrix using partial pivoting
 * @return rref: The reduced row echelon form of the matrix
 */
public Matrix rref()
{
    //ref()
    Matrix ref = copy();
    int iPivot = 0, jPivot = 0, greatestPivotRow;
    double[][] errorArray = new double[height][width];
    while(iPivot < height && jPivot < width)
    {
        do
        {
            //Finds row with greatest absolute-value-of-a-number at the horizontal value of the pivot position
            greatestPivotRow = iPivot;
            for(int n = iPivot; n < height; n++)
            {
                if(Math.abs(ref.getVal(n, jPivot)) > Math.abs(ref.getVal(greatestPivotRow, jPivot)))
                    greatestPivotRow = n;
            }
            //Swaps row at pivot with that row if that number is not 0 (Or less than the floating-point error)
            //If the largest number is 0, all numbers below in the column are 0, so jPivot increments and row swapper is repeated
            if(Math.abs(ref.getVal(greatestPivotRow, jPivot)) > errorArray[greatestPivotRow][jPivot])
                ref = ref.swapRows(iPivot, greatestPivotRow);
            else
                jPivot++;
        }
        while(jPivot < width && Math.abs(ref.getVal(greatestPivotRow, jPivot)) <= errorArray[greatestPivotRow][jPivot]); 
        if(jPivot < width)
        {
            //Pivot value becomes 1
            double rowMultiplier1 = 1/ref.getVal(iPivot,jPivot);
            for(int j = jPivot; j < width; j++)
            {
                ref.matrixArray[iPivot][j] = ref.getVal(iPivot,j) * rowMultiplier1;
                errorArray[iPivot][j] += 0.5 * (Math.ulp(ref.matrixArray[iPivot][j]) + Math.ulp(rowMultiplier1));
            }
            //1st value in nth row becomes 0
            for(int iTarget = iPivot + 1; iTarget < height; iTarget++)
            {
                double rowMultiplier0 = -ref.getVal(iTarget, jPivot)/ref.getVal(iPivot, jPivot);
                for(int j = jPivot; j < width; j++)
                {
                    errorArray[iTarget][j] += 0.5 * (Math.ulp(ref.getVal(iPivot, j) * rowMultiplier0) + Math.ulp(ref.getVal(iTarget, j)
                            + ref.getVal(iPivot, j)*rowMultiplier0) + Math.ulp(rowMultiplier0));
                    ref.matrixArray[iTarget][j] = ref.getVal(iTarget, j)
                            + ref.getVal(iPivot, j)*rowMultiplier0;
                }
            }
        }
        //Shifts pivot down 1 and to the right 1
        iPivot++;
        jPivot++;
    }

    //rref
    Matrix rref = ref.copy();
    iPivot = 1;
    jPivot = 1;
    //Moves pivot along the diagonal
    while(iPivot < height && jPivot < width)
    {
        //Moves horizontal position of pivot to first nonzero number in the row (the 1)
        int m = jPivot;
        while(m < width && Math.abs(rref.getVal(iPivot, m)) < errorArray[iPivot][m])
            m++;
        if(m != width)
        {
            jPivot = m;
            //1st value in rows above pivot become 0
            for(int iTarget = 0; iTarget < iPivot; iTarget++)
            {
                double rowMultiplier = -rref.getVal(iTarget, jPivot)/rref.getVal(iPivot, jPivot);
                for(int j = jPivot; j < width; j++)
                {
                    errorArray[iTarget][j] += 0.5 * (Math.ulp(rref.getVal(iTarget, j) * rowMultiplier) + Math.ulp(rref.getVal(iTarget, j)
                            + rref.getVal(iPivot, j)*rowMultiplier) + Math.ulp(rowMultiplier));
                    rref.matrixArray[iTarget][j] = rref.getVal(iTarget, j)
                            + rref.getVal(iPivot, j)*rowMultiplier;
                }
            }
        }
        iPivot++;
        jPivot++;
    }
    //Get rid of floating-point errors in integers
    for(int i = 0; i < height; i++)
    {
        for(int j =0; j < width; j++)
        {
            if(Math.abs(rref.getVal(i, j) - (int)(rref.getVal(i, j) + 0.5)) <= errorArray[i][j])
                rref.matrixArray[i][j] = (int)(rref.getVal(i, j) + 0.5);
        }
    }
    return rref;
}

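The last part of the code, which converts floating-point numbers that are within the computed error of an integer into that integer, is mainly there to tell me whether my error formula works, since some of the matrices I compute end up with values such as 5.000000000000004 instead of integers. So I know that if I have a number that is very close to an integer but is not the integer, my error bounds are not large enough, and apparently they are not, so I think I am doing something wrong.

My input matrix was one with the instance variable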

double[][] matrixArray = {{1,-2,0,0,3}, {2,-5,-3,-2,6}, {0,5,15,10,0}, {2,6,18,8,6}};

and my result was the array

[[1.0, 0.0, 0.0, -2.0000000000000013, 3.0], [0.0, 1.0, 0.0, -1.0000000000000004, 0.0], [0.0, 0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]

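Although my error calculation fixed the problem of zeros being turned into ones and then used in Gaussian elimination, I still have numbers that are not integers, so I know my error bounds are inaccurate. It may have worked in this case, but without correct error bounds it might not work in the next one.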

Answer 1

2 + 3 * 4 = 0.5 * ulp(2 + [3 * 4]) + 0.5 * ulp(3 * 4)

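Errors compound. Like interest, the final error can grow exponentially. The operations in your example are exact, so it is hard to see what you are complaining about (surely you did get 14?). Are you taking into account representation error, which means that the constants involved in your computation are not the mathematical values but 0.5ULP approximations of them?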
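To make the distinction between rounding error and representation error concrete, here is a small sketch that measures the error of a single addition in ULPs, once against the exact sum of the stored operands and once against the decimal value the literals were meant to express. The operands 0.1 and 0.2 are an illustrative choice, not taken from the question, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Illustrative helper; the class and method names are hypothetical.
final class UlpErrorDemo {

    /** Exact decimal expansion of the binary value stored in a double. */
    static BigDecimal exact(double d) {
        return new BigDecimal(d);
    }

    /** |computed - reference| expressed in units of ulp(computed). */
    static double errorInUlps(double computed, BigDecimal reference) {
        BigDecimal diff = exact(computed).subtract(reference).abs();
        BigDecimal ulp = exact(Math.ulp(computed));
        return diff.divide(ulp, MathContext.DECIMAL64).doubleValue();
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2; // one floating-point addition

        // Against the exact sum of the two stored operands: within 0.5 ulp.
        BigDecimal storedOperands = exact(0.1).add(exact(0.2));
        System.out.println(errorInUlps(sum, storedOperands));

        // Against the decimal value 0.3 that the literals were meant to express:
        // larger, because 0.1 and 0.2 were already rounded when they were stored.
        System.out.println(errorInUlps(sum, new BigDecimal("0.3")));
    }
}
```

The first figure stays within the 0.5 ULP bound for a single operation, while the second comes out at about 0.8 ULP. The 0.5 * ulp rule covers only the operation itself, not the rounding already present in its inputs.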

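In addition to the exponential growth of the error when it is computed statically with the necessary precision, there is the problem that you are using inexact floating-point math to compute the errors: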

errorArray[iTarget][j] += 0.5 * (Math.ulp(rref.getVal(iTarget, j) * rowMultiplier) + Math.ulp(rref.getVal(iTarget, j)

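The actual error can exceed what this statement computes, because nothing prevents the floating-point addition from being a lower approximation of the mathematical result (the multiplications are likely to be exact, because in each case one of the multiplicands is a power of two).

In another programming language, you could change the rounding mode to "upwards" for this computation, but Java does not provide access to this functionality.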

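Here are a few tangentially relevant remarks:

When the mathematically expected result is an integer, the usual way to obtain a double that is that integer is to ensure a 1ULP error for the entire computation. You almost never get a 1ULP bound for a computation involving more than a couple of operations, unless you take special steps to ensure this bound (for example, Dekker multiplication).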

Java lets you write constants and print results in hexadecimal format, which is what you should use if you want to see exactly what is happening.
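As a brief illustration of the hexadecimal format, the sketch below uses a hexadecimal floating-point literal and `Double.toHexString`, both standard Java. The class name and the `describe` helper are hypothetical, and the sample values are chosen only for demonstration.

```java
// Illustrative only; the class name and describe() helper are hypothetical.
final class HexFloatDemo {

    /** Pairs the usual decimal rendering of a double with its exact hexadecimal form. */
    static String describe(double d) {
        return d + " = " + Double.toHexString(d);
    }

    public static void main(String[] args) {
        double half = 0x1.0p-1; // hexadecimal floating-point literal, exactly 0.5
        System.out.println(describe(half));      // 0.5 = 0x1.0p-1
        System.out.println(describe(0.1));       // 0.1 = 0x1.999999999999ap-4
        System.out.println(describe(0.1 + 0.2)); // last hex digit is 4
        System.out.println(describe(0.3));       // last hex digit is 3
    }
}
```

The hexadecimal digits show the stored significand directly, so a difference of one unit in the last place is visible at a glance instead of being hidden by decimal rounding.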

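If you are interested in an upper bound on the final error for a specific computation (as opposed to a static bound for all computations), then interval arithmetic is much more precise than characterizing the error as a single absolute value, and it requires much less thinking. In the case where you know by other means that the result must be an integer, if the resulting interval contains only one integer, you can be certain that it is the only possible answer.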
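A minimal sketch of that idea follows. Because Java cannot switch the rounding mode, this version widens each computed bound by one ULP outward with `Math.nextDown` and `Math.nextUp`, which is a conservative substitute rather than true directed rounding. The `Interval` class and all of its method names are hypothetical, and overflow, NaN, and infinities are not handled.

```java
// Hypothetical minimal interval type; not a complete interval library.
final class Interval {
    final double lo, hi;

    Interval(double lo, double hi) { this.lo = lo; this.hi = hi; }

    /** Interval for a value that is exactly representable as a double. */
    static Interval of(double x) { return new Interval(x, x); }

    /** Interval for a decimal constant that was rounded when stored as a double. */
    static Interval around(double x) {
        return new Interval(Math.nextDown(x), Math.nextUp(x));
    }

    // A round-to-nearest result is within 0.5 ulp of the exact one,
    // so stepping one ulp outward always encloses the exact bound.
    Interval add(Interval o) {
        return new Interval(Math.nextDown(lo + o.lo), Math.nextUp(hi + o.hi));
    }

    Interval mul(Interval o) {
        double a = lo * o.lo, b = lo * o.hi, c = hi * o.lo, d = hi * o.hi;
        double min = Math.min(Math.min(a, b), Math.min(c, d));
        double max = Math.max(Math.max(a, b), Math.max(c, d));
        return new Interval(Math.nextDown(min), Math.nextUp(max));
    }

    Interval div(Interval o) {
        if (o.lo <= 0 && o.hi >= 0) {
            throw new ArithmeticException("divisor interval contains zero");
        }
        double a = lo / o.lo, b = lo / o.hi, c = hi / o.lo, d = hi / o.hi;
        double min = Math.min(Math.min(a, b), Math.min(c, d));
        double max = Math.max(Math.max(a, b), Math.max(c, d));
        return new Interval(Math.nextDown(min), Math.nextUp(max));
    }

    /** The single integer inside the interval, or null if there is none or more than one. */
    Long onlyInteger() {
        double first = Math.ceil(lo);
        if (first > hi) return null;      // no integer inside
        if (first + 1 <= hi) return null; // at least two integers inside
        return (long) first;
    }

    public static void main(String[] args) {
        // (0.1 + 0.2) * 10 is mathematically 3, but the double result is not exactly 3.
        Interval r = around(0.1).add(around(0.2)).mul(of(10));
        System.out.println("[" + r.lo + ", " + r.hi + "] -> " + r.onlyInteger());
    }
}
```

In the elimination code, each matrix entry would carry such an interval instead of a value plus a separate error estimate; an entry whose interval contains zero is a candidate for being treated as zero.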

Answer 2

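If you are interested in computing bounds on the error of the Gaussian elimination process, that is a very complex problem. For example, this paper gives a formula for the upper bound of the error: Higham NJ, Higham DJ. Large growth factors in Gaussian elimination with pivoting. SIAM Journal on Matrix Analysis and Applications. 1989;10(2):155.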

The formula given there is by no means simple!

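On the other hand, if your goal is to prevent creeping floating-point errors from ruining your zeros, I don't think you even need to create errorArray[][]. You can do fine by computing in floating point and then setting a precision condition with Math.ulp() or a machine epsilon. That way you will not need the final loop to "get rid of" those pesky zeros.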
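One way such a precision condition might look is sketched below. The helper name, the idea of passing in a reference magnitude (`scale`), and the allowance of a few ULPs are all assumptions made for illustration. The allowance is a tuning knob, not a proven error bound.

```java
// Hypothetical helper; the names and the ULP allowance are illustrative assumptions.
final class ZeroTolerance {

    /**
     * Treats value as zero when it is tiny relative to scale, where scale is
     * the largest magnitude that took part in producing value (for example,
     * the largest absolute entry of the row being eliminated).
     */
    static boolean isNegligible(double value, double scale, int ulpsAllowed) {
        return Math.abs(value) <= ulpsAllowed * Math.ulp(scale);
    }

    public static void main(String[] args) {
        double residue = 0.1 + 0.2 - 0.3; // mathematically zero, but not in floating point
        System.out.println(residue != 0.0);                  // true
        System.out.println(isNegligible(residue, 0.3, 4));   // true: treat as zero
        System.out.println(isNegligible(1e-3, 0.3, 4));      // false: a genuine nonzero
    }
}
```

In the pivot search, a test of this kind would replace the comparison against `errorArray[greatestPivotRow][jPivot]`. Using `Math.ulp(scale)` rather than a fixed epsilon keeps the threshold proportional to the size of the numbers involved.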

You could also use Java's BigDecimal to see whether you get better results. Perhaps this question and the answers given to it will help.
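As a rough sketch of what BigDecimal does and does not buy you: sums and products of decimal values are exact, but a division whose quotient does not terminate still has to be rounded to a chosen precision. The class name and the sample values are illustrative, and `MathContext.DECIMAL128` is just one possible precision.

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Illustrative only; the class name and sample values are hypothetical.
final class BigDecimalDemo {

    public static void main(String[] args) {
        // Addition of decimal values is exact when they are built from strings.
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        System.out.println(sum.compareTo(new BigDecimal("0.3")) == 0); // true

        // Division needs an explicit precision when the quotient does not terminate;
        // BigDecimal.ONE.divide(three) without one throws ArithmeticException.
        BigDecimal three = new BigDecimal(3);
        BigDecimal third = BigDecimal.ONE.divide(three, MathContext.DECIMAL128);

        // The rounded quotient no longer multiplies back to exactly 1.
        System.out.println(third.multiply(three).compareTo(BigDecimal.ONE) == 0); // false
    }
}
```

Since row reduction divides by pivots, BigDecimal reduces the rounding error but does not remove the need for some tolerance when testing for zero.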
