简体   繁体   English

缩放互补误差函数的精确计算,erfcx()

[英]Accurate computation of scaled complementary error function, erfcx()

The (exponentially) scaled complementary error function, commonly designated by erfcx , is defined mathematically as erfcx(x) := e x 2 erfc(x).通常由erfcx指定的(指数)缩放互补误差函数在数学上定义为 erfcx(x) := e x 2 erfc(x)。 It frequently occurs in diffusion problems in physics as well as chemistry.它经常出现在物理和化学中的扩散问题中。 While some mathematical environments, such as MATLAB and GNU Octave , provide this function, it is absent from the C standard math library, which only provides erf() and erfc() .虽然某些数学环境(例如MATLABGNU Octave )提供了此函数,但 C 标准数学库中没有它,它只提供erf()erfc()

While it is possible to implement one's own erfcx() based directly on the mathematical definition, this only works over a limited input domain, because in the positive half-plane erfc() underflows for arguments of moderate magnitude, while exp() overflows, as noted in this question , for example.虽然可以直接根据数学定义实现自己的erfcx() ,但这仅适用于有限的输入域,因为在正半平面erfc()对于中等幅度的参数会下溢,而exp()溢出,例如,如本问题所述

For use with C, one could adapt some erfcx() open-source implementations such as the one in the Faadeeva package , as pointed to in responses to this question .为了与 C 一起使用,可以调整一些erfcx()开源实现,例如Faadeeva 包中的实现,如对此问题的回答所指出的那样。 However, these implementations typically do not provide full accuracy for a given floating-point format.但是,这些实现通常不能为给定的浮点格式提供完全的准确性。 For example, tests with 2 32 test vectors show the maximum error of erfcx() as provided by the Faadeeva package to be 8.41 ulps in the positive half-plane and 511.68 ulps in the negative half-plane.例如,使用 2 32 个测试向量的测试显示 Faadeeva 包提供的erfcx()的最大误差为正半平面中的 8.41 ulps 和负半平面中的 511.68 ulps。

A reasonable bound for an accurate implementation would be 4 ulps, corresponding to the accuracy bound of the math functions in the LA profile of Intel's Vector Math library, which I have found to be a reasonable bound for non-trivial math function implementations that require both good accuracy and good performance.准确实现的合理界限是 4 ulps,对应于英特尔矢量数学库的 LA 配置文件中数学函数的精度界限,我发现这是需要两者的非平凡数学函数实现的合理界限良好的准确性和良好的性能。

How could erfcx() , and the corresponding single-precision version, erfcxf() , be implemented accurately, while using only the C standard math library, and requiring no external libraries? erfcx()和相应的单精度版本erfcxf()如何在仅使用 C 标准数学库且不需要外部库的情况下准确实现? We can assume that C's float nad double types are mapped to IEEE 754-2008 binary32 and binary64 floating-point types.我们可以假设 C 的float nad double类型映射到 IEEE 754-2008 binary32binary64浮点类型。 Hardware support for the fused multiply-add operation (FMA) can be assumed, as this is supported by all major processor architectures at this time.可以假设对融合乘加运算 (FMA) 的硬件支持,因为此时所有主要处理器架构都支持这一点。

The best approach for an erfcx() implementation I have found so far is based on the following paper:到目前为止,我发现的erfcx()实现的最佳方法基于以下论文:

MM Shepherd and JG Laframboise, "Chebyshev Approximation of (1 + 2 x) exp(x 2 ) erfc x in 0 ≤ x < ∞." MM Shepherd 和 JG Laframboise,“(1 + 2 x) exp(x 2 ) erfc x 在 0 ≤ x < ∞ 中的切比雪夫近似。” Mathematics of Computation , Volume 36, No. 153, January 1981, pp. 249-253 (online)计算数学,第 36 卷,第 153 期,1981 年 1 月,第 249-253 页(在线)

The paper proposes clever transformations that map the scaled complementary error function to a tightly bounded helper function that is amenable to straightforward polynomial approximation.该论文提出了巧妙的转换,将缩放的互补误差函数映射到一个紧有界的辅助函数,该函数适合直接的多项式近似。 I have experimented with variations of the transformations for the sake of performance, but all of these have had negative impact on accuracy.为了性能,我尝试了各种变换,但所有这些都对准确性产生了负面影响。 The choice of the constant K in the transformation (x - K) / (x + K) has a non-obvious relationship with the accuracy of the core approximation.变换(x-K)/(x+K)中常数K的选择与核近似的精度有不明显的关系。 I empirically determined "optimal" values, which differ from the paper.我凭经验确定了与论文不同的“最佳”值。

The transformations for the arguments to the core approximation and the intermediate results back into into erfcx results incur additional rounding errors.将参数转换为核心近似值并将中间结果转换回erfcx结果会导致额外的舍入误差。 To mitigate their impact on accuracy, we need to apply compensation steps, which I outlined in some detail in my earlier question & answer regarding erfcf .为了减轻它们对准确性的影响,我们需要应用补偿步骤,我在之前关于erfcf问答中详细概述了这些步骤。 The availability of FMA greatly simplifies this task. FMA 的可用性大大简化了这项任务。

The resulting single-precision code looks as follows:生成的单精度代码如下所示:

/*  
 * Based on: M. M. Shepherd and J. G. Laframboise, "Chebyshev Approximation of 
 * (1+2x)exp(x^2)erfc x in 0 <= x < INF", Mathematics of Computation, Vol. 36,
 * No. 153, January 1981, pp. 249-253.
 *
 */  
float my_erfcxf (float x)
{
    float a, d, e, m, p, q, r, s, t;

    a = fmaxf (x, 0.0f - x); // NaN-preserving absolute value computation

    /* Compute q = (a-2)/(a+2) accurately. [0,INF) -> [-1,1] */
    m = a - 2.0f;
    p = a + 2.0f;
#if FAST_RCP_SSE
    r = fast_recipf_sse (p);
#else
    r = 1.0f / p;
#endif
    q = m * r;
    t = fmaf (q + 1.0f, -2.0f, a); 
    e = fmaf (q, -a, t); 
    q = fmaf (r, e, q); 

    /* Approximate (1+2*a)*exp(a*a)*erfc(a) as p(q)+1 for q in [-1,1] */
    p =              0x1.f10000p-15f;  //  5.92470169e-5
    p = fmaf (p, q,  0x1.521cc6p-13f); //  1.61224554e-4
    p = fmaf (p, q, -0x1.6b4ffep-12f); // -3.46481771e-4
    p = fmaf (p, q, -0x1.6e2a7cp-10f); // -1.39681227e-3
    p = fmaf (p, q,  0x1.3c1d7ep-10f); //  1.20588380e-3
    p = fmaf (p, q,  0x1.1cc236p-07f); //  8.69014394e-3
    p = fmaf (p, q, -0x1.069940p-07f); // -8.01387429e-3
    p = fmaf (p, q, -0x1.bc1b6cp-05f); // -5.42122945e-2
    p = fmaf (p, q,  0x1.4ff8acp-03f); //  1.64048523e-1
    p = fmaf (p, q, -0x1.54081ap-03f); // -1.66031078e-1
    p = fmaf (p, q, -0x1.7bf5cep-04f); // -9.27637145e-2
    p = fmaf (p, q,  0x1.1ba03ap-02f); //  2.76978403e-1

    /* Divide (1+p) by (1+2*a) ==> exp(a*a)*erfc(a) */
    d = a + 0.5f;
#if FAST_RCP_SSE
    r = fast_recipf_sse (d);
#else
    r = 1.0f / d;
#endif
    r = r * 0.5f;
    q = fmaf (p, r, r); // q = (p+1)/(1+2*a)
    t = q + q;
    e = (p - q) + fmaf (t, -a, 1.0f); // residual: (p+1)-q*(1+2*a)
    r = fmaf (e, r, q);

    if (a > 0x1.fffffep127f) r = 0.0f; // 3.40282347e+38 // handle INF argument

    /* Handle negative arguments: erfcx(x) = 2*exp(x*x) - erfcx(|x|) */
    if (x < 0.0f) {
        s = x * x;
        d = fmaf (x, x, -s);
        e = expf (s);
        r = e - r;
        r = fmaf (e, d + d, r); 
        r = r + e;
        if (e > 0x1.fffffep127f) r = e; // 3.40282347e+38 // avoid creating NaN
    }
    return r;
}

The maximum error of this implementation in the negative half-plane will depend on the accuracy of the standard math library's implementation of expf() .此实现在负半平面中的最大误差将取决于标准数学库的expf()实现的准确性。 Using the Intel compiler, version 13.1.3.198, and compiling with /fp:strict I observed a maximum error of 2.00450 ulps in the positive half-plane and 2.38412 ulps in the negative half-plane in an exhaustive test.使用英特尔编译器版本 13.1.3.198 并使用/fp:strict编译,我在详尽测试中观察到正半平面中的最大误差为 2.00450 ulps,负半平面中的最大误差为 2.38412 ulps。 Best I can tell at this time, a faithfully-rounded implementation of expf() will result in a maximum error of less than 2.5 ulps.目前我能说的最好的是, expf()的忠实圆润实现将导致小于 2.5 ulps 的最大误差。

Note that while the code contains two divisions, which are potentially slow operations, they occur in the special form of reciprocals, and are thus amenable to the use of fast reciprocal approximations on many platforms.请注意,虽然代码包含两个可能是缓慢运算的除法,但它们以倒数的特殊形式出现,因此适合在许多平台上使用快速倒数近似。 As long as the reciprocal approximation is faithfully rounded, the impact on erfcxf() accuracy seems to be negligible, based on experiments.根据实验,只要倒数近似值忠实地四舍五入,对erfcxf()准确性的影响似乎可以忽略不计。 Even slightly larger errors, such as in the fast SSE version (with maximum error < 2.0 ulps) appear to have only a minor impact.即使是稍大的错误,例如在快速 SSE 版本(最大错误 < 2.0 ulps)中,似乎也只有很小的影响。

/* Fast reciprocal approximation. HW approximation plus Newton iteration */
float fast_recipf_sse (float a)
{
    __m128 t;
    float e, r;
    t = _mm_set_ss (a);
    t = _mm_rcp_ss (t);
    _mm_store_ss (&r, t);
    e = fmaf (0.0f - a, r, 1.0f);
    r = fmaf (e, r, r);
    return r;
}

Double-precision version erfcx() is structurally identical to the single-precision version erfcxf() , but requires a minimax polynomial approximation with many more terms.双精度版本erfcx()在结构上与单精度版本erfcxf() ,但需要具有更多项的极小极大多项式近似。 This presents a challenge when optimizing the core approximation, as many heuristics will break down when the search space is very large.这在优化核心近似时提出了一个挑战,因为当搜索空间非常大时,许多启发式方法会失效。 The coefficients below represent my best solution to date, and there is definitely room for improvement.下面的系数代表了我迄今为止最好的解决方案,肯定有改进的余地。 Building with the Intel compiler and /fp:strict , and using 2 32 random test vectors, the maximum error observed was 2.83788 ulps in the positive half-plane and 2.77856 ulps in the negative half-plane.使用英特尔编译器和/fp:strict构建,并使用 2 32 个随机测试向量,观察到的最大误差在正半平面中为 2.83788 ulps,在负半平面中为 2.77856 ulps。

double my_erfcx (double x)
{
    double a, d, e, m, p, q, r, s, t;

    a = fmax (x, 0.0 - x); // NaN preserving absolute value computation

    /* Compute q = (a-4)/(a+4) accurately. [0,INF) -> [-1,1] */
    m = a - 4.0;
    p = a + 4.0;
    r = 1.0 / p;
    q = m * r;
    t = fma (q + 1.0, -4.0, a); 
    e = fma (q, -a, t); 
    q = fma (r, e, q); 

    /* Approximate (1+2*a)*exp(a*a)*erfc(a) as p(q)+1 for q in [-1,1] */
    p =             0x1.edcad78fc8044p-31;  //  8.9820305531190140e-10
    p = fma (p, q,  0x1.b1548f14735d1p-30); //  1.5764464777959401e-09
    p = fma (p, q, -0x1.a1ad2e6c4a7a8p-27); // -1.2155985739342269e-08
    p = fma (p, q, -0x1.1985b48f08574p-26); // -1.6386753783877791e-08
    p = fma (p, q,  0x1.c6a8093ac4f83p-24); //  1.0585794011876720e-07
    p = fma (p, q,  0x1.31c2b2b44b731p-24); //  7.1190423171700940e-08
    p = fma (p, q, -0x1.b87373facb29fp-21); // -8.2040389712752056e-07
    p = fma (p, q,  0x1.3fef1358803b7p-22); //  2.9796165315625938e-07
    p = fma (p, q,  0x1.7eec072bb0be3p-18); //  5.7059822144459833e-06
    p = fma (p, q, -0x1.78a680a741c4ap-17); // -1.1225056665965572e-05
    p = fma (p, q, -0x1.9951f39295cf4p-16); // -2.4397380523258482e-05
    p = fma (p, q,  0x1.3be1255ce180bp-13); //  1.5062307184282616e-04
    p = fma (p, q, -0x1.a1df71176b791p-13); // -1.9925728768782324e-04
    p = fma (p, q, -0x1.8d4aaa0099bc8p-11); // -7.5777369791018515e-04
    p = fma (p, q,  0x1.49c673066c831p-8);  //  5.0319701025945277e-03
    p = fma (p, q, -0x1.0962386ea02b7p-6);  // -1.6197733983519948e-02
    p = fma (p, q,  0x1.3079edf465cc3p-5);  //  3.7167515521269866e-02
    p = fma (p, q, -0x1.0fb06dfedc4ccp-4);  // -6.6330365820039094e-02
    p = fma (p, q,  0x1.7fee004e266dfp-4);  //  9.3732834999538536e-02
    p = fma (p, q, -0x1.9ddb23c3e14d2p-4);  // -1.0103906603588378e-01
    p = fma (p, q,  0x1.16ecefcfa4865p-4);  //  6.8097054254651804e-02
    p = fma (p, q,  0x1.f7f5df66fc349p-7);  //  1.5379652102610957e-02
    p = fma (p, q, -0x1.1df1ad154a27fp-3);  // -1.3962111684056208e-01
    p = fma (p, q,  0x1.dd2c8b74febf6p-3);  //  2.3299511862555250e-01

    /* Divide (1+p) by (1+2*a) ==> exp(a*a)*erfc(a) */
    d = a + 0.5;
    r = 1.0 / d;
    r = r * 0.5;
    q = fma (p, r, r); // q = (p+1)/(1+2*a)
    t = q + q;
    e = (p - q) + fma (t, -a, 1.0); // residual: (p+1)-q*(1+2*a)
    r = fma (e, r, q);

    /* Handle argument of infinity */
    if (a > 0x1.fffffffffffffp1023) r = 0.0;

    /* Handle negative arguments: erfcx(x) = 2*exp(x*x) - erfcx(|x|) */
    if (x < 0.0) {
        s = x * x;
        d = fma (x, x, -s);
        e = exp (s);
        r = e - r;
        r = fma (e, d + d, r); 
        r = r + e;
        if (e > 0x1.fffffffffffffp1023) r = e; // avoid creating NaN
    }
    return r;
}

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM