Implementing table-lookup-based trig functions

Question

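For a video game I'm implementing in my spare time, I've tried writing my own versions of sinf(), cosf(), and atan2f() using lookup tables. The intent is to have implementations that are faster, at the cost of some accuracy.

My initial implementation is below. The functions work and return good approximate values. The only problem is that they are slower than calling the standard sinf(), cosf(), and atan2f() functions.

So, what am I doing wrong?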

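// Geometry.h includes definitions of PI, TWO_PI, etc., as
// well as the prototypes for the public functions
#include "Geometry.h"

// Needed for assert() and for sinf(), atanf(), fabsf(), signbit(),
// in case Geometry.h does not already include them
#include <assert.h>
#include <math.h>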

namespace {
    // Number of entries in the sin/cos lookup table
    const int SinTableCount = 512;

    // Angle covered by each table entry
    const float SinTableDelta = TWO_PI / (float)SinTableCount;

    // Lookup table for Sin() results
    float SinTable[SinTableCount];

    // This object initializes the contents of the SinTable array exactly once
    class SinTableInitializer {
    public:
        SinTableInitializer() {
            for (int i = 0; i < SinTableCount; ++i) {
                SinTable[i] = sinf((float)i * SinTableDelta);
            }
        }
    };
    static SinTableInitializer sinTableInitializer;

    // Number of entries in the atan lookup table
    const int AtanTableCount = 512;

    // Interval covered by each Atan table entry
    const float AtanTableDelta = 1.0f / (float)AtanTableCount;

    // Lookup table for Atan() results
    float AtanTable[AtanTableCount];

    // This object initializes the contents of the AtanTable array exactly once
    class AtanTableInitializer {
    public:
        AtanTableInitializer() {
            for (int i = 0; i < AtanTableCount; ++i) {
                AtanTable[i] = atanf((float)i * AtanTableDelta);
            }
        }
    };
    static AtanTableInitializer atanTableInitializer;

    // Lookup result in table.
    // Preconditions: y > 0, x > 0, y < x
    static float AtanLookup2(float y, float x) {
        assert(y > 0.0f);
        assert(x > 0.0f);
        assert(y < x);

        const float ratio = y / x;
        const int index = (int)(ratio / AtanTableDelta);
        return AtanTable[index];    
    }

}

float Sin(float angle) {
    // If angle is negative, reflect around X-axis and negate result
    bool mustNegateResult = false;
    if (angle < 0.0f) {
        mustNegateResult = true;
        angle = -angle;
    }

    // Normalize angle so that it is in the interval (0.0, PI)
    while (angle >= TWO_PI) {
        angle -= TWO_PI;
    }

    const int index = (int)(angle / SinTableDelta);
    const float result = SinTable[index];

    return mustNegateResult? (-result) : result;
}

float Cos(float angle) {
    return Sin(angle + PI_2);
}

float Atan2(float y, float x) {
    // Handle x == 0 or x == -0
    // (See atan2(3) for specification of sign-bit handling.)
    if (x == 0.0f) {
        if (y > 0.0f) {
            return PI_2;
        }
        else if (y < 0.0f) {
            return -PI_2;
        }
        else if (signbit(x)) {
            return signbit(y)? -PI : PI;
        }
        else {
            return signbit(y)? -0.0f : 0.0f;
        }
    }

    // Handle y == 0, x != 0
    if (y == 0.0f) {
        return (x > 0.0f)? 0.0f : PI;
    }

    // Handle y == x
    if (y == x) {
        return (x > 0.0f)? PI_4 : -(3.0f * PI_4);
    }

    // Handle y == -x
    if (y == -x) {
        return (x > 0.0f)? -PI_4 : (3.0f * PI_4);
    }

    // For other cases, determine quadrant and do appropriate lookup and calculation
    bool right = (x > 0.0f);
    bool top = (y > 0.0f);
    if (right && top) {
        // First quadrant
        if (y < x) {
            return AtanLookup2(y, x);
        }
        else {
            return PI_2 - AtanLookup2(x, y);
        }
    }
    else if (!right && top) {
        // Second quadrant
        const float posx = fabsf(x);
        if (y < posx) {
            return PI - AtanLookup2(y, posx);
        }
        else {
            return PI_2 + AtanLookup2(posx, y);
        }
    }
    else if (!right && !top) {
        // Third quadrant
        const float posx = fabsf(x);
        const float posy = fabsf(y);
        if (posy < posx) {
            return -PI + AtanLookup2(posy, posx);
        }
        else {
            return -PI_2 - AtanLookup2(posx, posy);
        }
    }
    else { // right && !top
        // Fourth quadrant
        const float posy = fabsf(y);
        if (posy < x) {
            return -AtanLookup2(posy, x);
        }
        else {
            return -PI_2 + AtanLookup2(x, posy);
        }
    }

    return 0.0f;
}

Answer 1

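"Premature optimization is the root of all evil" - Donald Knuth

Nowadays compilers provide very efficient intrinsics for trigonometric functions that get the best out of modern processors (SSE etc.), which explains why you can hardly beat the built-in functions. Don't lose too much time on these parts; instead, concentrate on the real bottlenecks that you can spot with a profiler.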

Answer 2

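Remember that you have a co-processor... If it were 1993 you would have seen a speed-up, but today you will struggle to beat the native intrinsics.

Try viewing the disassembly of sinf.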

Answer 3

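Someone has already benchmarked this, and it looks as though the Trig.Math functions are already optimized and will be faster than any lookup table you can come up with: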

http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html

(They didn't use anchors on the page, so you have to scroll about a third of the way down.)

Answer 4

I'm concerned about this part:

// Normalize angle so that it is in the interval (0.0, PI)
while (angle >= TWO_PI) {
    angle -= TWO_PI;
}

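But you can find out for certain: add timers to all the functions, write dedicated performance tests, run them, and print a timing report. I think you will know the answer after these tests.

You could also use a profiling tool such as AQTime.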
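
The timing approach described above could be sketched roughly as follows. This is only a minimal illustration, not code from the original post: the names AverageNanosPerCall and MakeAngles are hypothetical, and the numbers it reports depend heavily on compiler flags and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical helper (not from the original post): calls fn on every input,
// `repeats` times over, and returns the average time per call in nanoseconds.
// Results are accumulated into `sink` so the optimizer cannot discard the calls.
template <typename Fn>
double AverageNanosPerCall(Fn fn, const std::vector<float>& inputs,
                           int repeats, float& sink) {
    assert(repeats > 0 && !inputs.empty());
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    for (int r = 0; r < repeats; ++r) {
        for (float x : inputs) {
            sink += fn(x);
        }
    }
    const Clock::time_point stop = Clock::now();
    const double totalNanos =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return totalNanos /
           (static_cast<double>(repeats) * static_cast<double>(inputs.size()));
}

// Hypothetical helper: builds `count` evenly spaced angles in [0, maxAngle).
std::vector<float> MakeAngles(int count, float maxAngle) {
    std::vector<float> angles;
    angles.reserve(count);
    for (int i = 0; i < count; ++i) {
        angles.push_back(maxAngle * static_cast<float>(i) /
                         static_cast<float>(count));
    }
    return angles;
}
```

One way to use it: time a lambda that calls std::sin and a lambda that calls the question's Sin() on the same MakeAngles(...) input, print both averages together with sink, and compare them in an optimized build.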

Answer 5

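The built-in functions are already very well optimized, so it is going to be really hard to beat them. Personally, I'd look elsewhere for places to gain performance.

That said, there is one optimization I can see in your code: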

// Normalize angle so that it is in the interval (0.0, PI)
while (angle >= TWO_PI) {
    angle -= TWO_PI;
}

could be replaced with:

angle = fmod(angle, TWO_PI);
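
As a rough sketch of the two normalization strategies side by side (the names WrapByLoop, WrapByFmod, and kTwoPi are hypothetical; the original takes TWO_PI from Geometry.h), assuming float angles:

```cpp
#include <cassert>
#include <cmath>

// Assumed value; the original post takes TWO_PI from Geometry.h.
const float kTwoPi = 6.28318530717958647692f;

// Range reduction as written in the question: repeated subtraction.
// Precondition: angle >= 0. The loop runs once per full turn in the input,
// so large angles cost proportionally more.
float WrapByLoop(float angle) {
    assert(angle >= 0.0f);
    while (angle >= kTwoPi) {
        angle -= kTwoPi;
    }
    return angle;
}

// Range reduction with a single library call.
// std::fmod keeps the sign of its first argument, so a negative result is
// shifted up by one turn to land in [0, kTwoPi).
float WrapByFmod(float angle) {
    float wrapped = std::fmod(angle, kTwoPi);
    if (wrapped < 0.0f) {
        wrapped += kTwoPi;
    }
    // Adding kTwoPi to a tiny negative value can round to exactly kTwoPi.
    if (wrapped >= kTwoPi) {
        wrapped = 0.0f;
    }
    return wrapped;
}
```

Because Sin() in the question already makes the angle non-negative before normalizing, the negative-input branch is not strictly needed there. Whether the fmod version is actually faster than the loop for typical inputs is something to measure rather than assume.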

5 answers

Answer 1: score 9, accepted, 2009-03-16 15:12:20
Answer 2: score 3, 2009-03-16 16:04:57
Answer 3: score 2, 2009-03-16 15:15:08
Answer 4: score 0, 2009-03-16 15:11:34
Answer 5: score 0, 2009-03-16 15:18:55
