简体   繁体   English

浮点数的确切值作为理性值

[英]Exact value of a floating-point number as a rational

I'm looking for a method to convert the exact value of a floating-point number to a rational quotient of two integers, ie a / b , where b is not larger than a specified maximum denominator b_max . 我正在寻找一种方法将浮点数的精确值转换为两个整数合理商,即a / b ,其中b不大于指定的最大分母b_max If satisfying the condition b <= b_max is impossible, then the result falls back to the best approximation which still satisfies the condition. 如果满足条件b <= b_max是不可能的,则结果回落到仍然满足条件的最佳近似。

Hold on. 坚持,稍等。 There are a lot of questions/answers here about the best rational approximation of a truncated real number which is represented as a floating-point number. 这里有很多关于截断实的最合理逼近的问题/答案,它被表示为浮点数。 However I'm interested in the exact value of a floating-point number, which is itself a rational number with a different representation. 但是我对浮点数的确切值感兴趣,浮点数本身就是一个具有不同表示的有理数。 More specifically, the mathematical set of floating-point numbers is a subset of rational numbers. 更具体地,浮点数的数学集合是有理数的子集。 In case of IEEE 754 binary floating-point standard it is a subset of dyadic rationals . 在IEEE 754二进制浮点标准的情况下,它是二元有理数的子集。 Anyway, any floating-point number can be converted to a rational quotient of two finite precision integers as a / b . 无论如何,任何浮点数都可以转换为两个有限精度整数的有理商作为a / b

So, for example assuming IEEE 754 single-precision binary floating-point format, the rational equivalent of float f = 1.0f / 3.0f is not 1 / 3 , but 11184811 / 33554432 . 因此,例如假设IEEE 754单精度二进制浮点格式, float f = 1.0f / 3.0f的有理等价不是1 / 3 ,而是11184811 / 33554432 This is the exact value of f , which is a number from the mathematical set of IEEE 754 single-precision binary floating-point numbers. 这是f精确值,它是IEEE 754单精度二进制浮点数的数学集合中的数字。

Based on my experience, traversing (by binary search of) the Stern-Brocot tree is not useful here, since that is more suitable for approximating the value of a floating-point number, when it is interpreted as a truncated real instead of an exact rational . 根据我的经验,遍历(通过二分搜索) Stern-Brocot树在这里没有用,因为当它被解释为截断的实数而不是精确数时,它更适合于近似浮点数的值。 理性的

Possibly, continued fractions are the way to go. 可能, 继续分数是要走的路。

The another problem here is integer overflow. 这里的另一个问题是整数溢出。 Think about that we want to represent the rational as the quotient of two int32_t , where the maximum denominator b_max = INT32_MAX . 想想我们想要将理性表示为两个int32_t的商,其中最大分母b_max = INT32_MAX We cannot rely on a stopping criterion like b > b_max . 我们不能依赖像b > b_max这样的停止标准。 So the algorithm must never overflow, or it must detect overflow. 所以算法必须永远不会溢出,否则它必须检测溢出。

What I found so far is an algorithm from Rosetta Code , which is based on continued fractions, but its source mentions it is "still not quite complete". 到目前为止我发现的Rosetta Code的算法 ,该算法基于连续分数,但其来源提到它“仍然不完全”。 Some basic tests gave good results, but I cannot confirm its overall correctness and I think it can easily overflow. 一些基本测试给出了很好的结果,但我无法确认其整体正确性,我认为它很容易溢出。

// https://rosettacode.org/wiki/Convert_decimal_number_to_rational#C

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <stdint.h>

/* f : number to convert.
 * num, denom: returned parts of the rational.
 * md: max denominator value.  Note that machine floating point number
 *     has a finite resolution (10e-16 ish for 64 bit double), so specifying
 *     a "best match with minimal error" is often wrong, because one can
 *     always just retrieve the significand and return that divided by 
 *     2**52, which is in a sense accurate, but generally not very useful:
 *     1.0/7.0 would be "2573485501354569/18014398509481984", for example.
 */
void rat_approx(double f, int64_t md, int64_t *num, int64_t *denom)
{
    /*  a: continued fraction coefficients. */
    int64_t a, h[3] = { 0, 1, 0 }, k[3] = { 1, 0, 0 };
    int64_t x, d, n = 1;
    int i, neg = 0;

    if (md <= 1) { *denom = 1; *num = (int64_t) f; return; }

    if (f < 0) { neg = 1; f = -f; }

    while (f != floor(f)) { n <<= 1; f *= 2; }
    d = f;

    /* continued fraction and check denominator each step */
    for (i = 0; i < 64; i++) {
        a = n ? d / n : 0;
        if (i && !a) break;

        x = d; d = n; n = x % n;

        x = a;
        if (k[1] * a + k[0] >= md) {
            x = (md - k[0]) / k[1];
            if (x * 2 >= a || k[1] >= md)
                i = 65;
            else
                break;
        }

        h[2] = x * h[1] + h[0]; h[0] = h[1]; h[1] = h[2];
        k[2] = x * k[1] + k[0]; k[0] = k[1]; k[1] = k[2];
    }
    *denom = k[1];
    *num = neg ? -h[1] : h[1];
}

All finite double are rational numbers as OP well stated.. 所有有限double都是有理数,如OP所述。

Use frexp() to break the number into its fraction and exponent. 使用frexp()将数字分解为其分数和指数。 The end result still needs to use double to represent whole number values due to range requirements. 由于范围要求,最终结果仍然需要使用double来表示整数值。 Some numbers are too small, ( x smaller than 1.0/(2.0,DBL_MAX_EXP) ) and infinity, not-a-number are issues. 有些数字太小,( x小于1.0/(2.0,DBL_MAX_EXP) )和无穷大,非数字是问题。

The frexp functions break a floating-point number into a normalized fraction and an integral power of 2. ... interval [1/2, 1) or zero ... frexp函数将一个浮点数分解为一个归一化的分数和一个2的整数幂。... interval [1 / 2,1}或0 ...
C11 §7.12.6.4 2/3 C11§7.12.6.42/ 3

#include <math.h>
#include <float.h>

_Static_assert(FLT_RADIX == 2, "TBD code for non-binary FP");

// Return error flag
int split(double x, double *numerator, double *denominator) {
  if (!isfinite(x)) {
    *numerator = *denominator = 0.0;
    if (x > 0.0) *numerator = 1.0;
    if (x < 0.0) *numerator = -1.0;
    return 1;
  }
  int bdigits = DBL_MANT_DIG;
  int expo;
  *denominator = 1.0;
  *numerator = frexp(x, &expo) * pow(2.0, bdigits);
  expo -= bdigits;
  if (expo > 0) {
    *numerator *= pow(2.0, expo);
  }
  else if (expo < 0) {
    expo = -expo;
    if (expo >= DBL_MAX_EXP-1) {
      *numerator /= pow(2.0, expo - (DBL_MAX_EXP-1));
      *denominator *= pow(2.0, DBL_MAX_EXP-1);
      return fabs(*numerator) < 1.0;
    } else {
      *denominator *= pow(2.0, expo);
    }
  }

  while (*numerator && fmod(*numerator,2) == 0 && fmod(*denominator,2) == 0) {
    *numerator /= 2.0;
    *denominator /= 2.0;
  }
  return 0;
}

void split_test(double x) {
  double numerator, denominator;
  int err = split(x, &numerator, &denominator);
  printf("e:%d x:%24.17g n:%24.17g d:%24.17g q:%24.17g\n", 
      err, x, numerator, denominator, numerator/ denominator);
}

int main(void) {
  volatile float third = 1.0f/3.0f;
  split_test(third);
  split_test(0.0);
  split_test(0.5);
  split_test(1.0);
  split_test(2.0);
  split_test(1.0/7);
  split_test(DBL_TRUE_MIN);
  split_test(DBL_MIN);
  split_test(DBL_MAX);
  return 0;
}

Output 产量

e:0 x:      0.3333333432674408 n:                11184811 d:                33554432 q:      0.3333333432674408
e:0 x:                       0 n:                       0 d:        9007199254740992 q:                       0
e:0 x:                       1 n:                       1 d:                       1 q:                       1
e:0 x:                     0.5 n:                       1 d:                       2 q:                     0.5
e:0 x:                       1 n:                       1 d:                       1 q:                       1
e:0 x:                       2 n:                       2 d:                       1 q:                       2
e:0 x:     0.14285714285714285 n:        2573485501354569 d:       18014398509481984 q:     0.14285714285714285
e:1 x: 4.9406564584124654e-324 n:  4.4408920985006262e-16 d: 8.9884656743115795e+307 q: 4.9406564584124654e-324
e:0 x: 2.2250738585072014e-308 n:                       2 d: 8.9884656743115795e+307 q: 2.2250738585072014e-308
e:0 x: 1.7976931348623157e+308 n: 1.7976931348623157e+308 d:                       1 q: 1.7976931348623157e+308

Leave the b_max consideration for later. b_max考虑b_max


More expedient code is possible with replacing pow(2.0, expo) with ldexp(1, expo) @gammatester or exp2(expo) @Bob__ 使用ldexp(1, expo) @gammatesterexp2(expo) @Bob__替换pow(2.0, expo) ldexp(1, expo)更方便的代码

while (*numerator && fmod(*numerator,2) == 0 && fmod(*denominator,2) == 0) could also use some performance improvements. while (*numerator && fmod(*numerator,2) == 0 && fmod(*denominator,2) == 0)也可以使用一些性能改进。 But first, let us get the functionality as needed. 但首先,让我们根据需要获得功能。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM