
C: convert a real number to 64 bit floating point binary

I'm trying to write code that converts a real number to its 64-bit floating-point binary representation. The user inputs a real number (for example, 547.4242) and the program must output the corresponding 64-bit floating-point binary.

My ideas:

  • The sign part is easy.
  • The program converts the integer part (547 for the previous example) and stores the result in an int variable. Then, the program converts the fractional part (.4242 for the previous example) and stores the result in an array (each position of the array stores '1' or '0').

This is where I'm stuck. Summarizing, I have: "Integer part = 1000100011" (type int) and "Fractional part = 0110110010011000010111110000011011110110100101000100" (array).

How can I proceed?
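One way to continue from the two bit strings is to normalise them to 1.xxx * 2^e and pack sign, biased exponent, and 52 fraction bits into one 64-bit word. The following is only a minimal sketch: bits_to_binary64 and bit_at are hypothetical names, both parts are assumed to be passed as '0'/'1' strings, extra bits are truncated rather than rounded, and subnormals, overflow, and infinities are ignored.

```c
#include <stdint.h>
#include <string.h>

/* Bit i of the concatenation "int_bits frac_bits" (hypothetical helper). */
static int bit_at(const char *a, size_t na, const char *b, size_t i)
{
    return (i < na ? a[i] : b[i - na]) == '1';
}

/* Sketch: pack an IEEE 754 binary64 pattern from an integer-part bit
   string and a fraction-part bit string (most significant bit first).
   Assumption: the result is a normal number; surplus bits are dropped. */
uint64_t bits_to_binary64(int negative, const char *int_bits,
                          const char *frac_bits)
{
    size_t ni = strlen(int_bits), nf = strlen(frac_bits);
    size_t total = ni + nf, first = 0, k;
    uint64_t sign = (uint64_t)(negative != 0) << 63;
    uint64_t fraction = 0;
    int exponent;

    /* Find the leading 1; it becomes the implicit bit of the significand. */
    while (first < total && !bit_at(int_bits, ni, frac_bits, first))
        first++;
    if (first == total)
        return sign;                      /* all bits zero: +0 or -0 */

    /* The binary point sits after int_bits, so this is the power of two. */
    exponent = (int)ni - 1 - (int)first;

    /* The 52 bits after the leading 1, padded with zeros. */
    for (k = 1; k <= 52; k++) {
        size_t pos = first + k;
        int b = pos < total ? bit_at(int_bits, ni, frac_bits, pos) : 0;
        fraction = (fraction << 1) | (uint64_t)b;
    }

    return sign | ((uint64_t)(exponent + 1023) << 52) | fraction;
}
```

For the example above, the leading 1 of 1000100011 is nine places left of the binary point, so the stored exponent field is 9 + 1023 = 1032.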

The following code prints the internal representation of a floating-point number (a 32-bit float) according to IEEE 754. It was originally written in the Turbo C++ IDE, but it is easy to adapt to other compilers.

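#include <stdio.h>

void decimal_to_binary(unsigned char);

/* Overlay the float with its raw bytes. */
union u
{
    float f;
    unsigned char c[sizeof(float)];
};

int main(void)
{
    int i;
    union u a;

    printf("ENTER THE FLOATING POINT NUMBER : \n");
    if (scanf("%f", &a.f) != 1)
        return 1;

    /* Walk the bytes from the highest address down. On a little-endian
       machine (e.g. x86) this prints the most significant byte first. */
    for (i = (int)sizeof(float) - 1; i >= 0; i--)
        decimal_to_binary(a.c[i]);
    printf("\n");

    return 0;
}

/* Print one byte as eight binary digits, most significant bit first. */
void decimal_to_binary(unsigned char n)
{
    int arr[8];
    int i;

    /* Fill the array from the least significant bit upwards. */
    for (i = 7; i >= 0; i--)
    {
        if (n % 2 == 0)
            arr[i] = 0;
        else
            arr[i] = 1;
        n /= 2;
    }

    for (i = 0; i < 8; i++)
        printf("%d", arr[i]);
    printf(" ");
}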

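Since the question asks for 64 bits, the same idea can be applied to a double. The sketch below is an illustration, not part of the answer above: double_to_bit_string is a hypothetical name, and it assumes that double is an 8-byte IEEE 754 binary64 value with the same byte order as uint64_t, which holds on mainstream platforms. Copying the bytes into an integer with memcpy avoids depending on byte order when printing.

```c
#include <stdint.h>
#include <string.h>

/* Sketch: write the 64 bits of a double into out as '0'/'1' characters,
   most significant bit first: 1 sign bit, 11 exponent bits, 52 fraction
   bits. Assumes sizeof(double) == 8 and IEEE 754 binary64. */
void double_to_bit_string(double d, char out[65])
{
    uint64_t bits;
    int i;

    memcpy(&bits, &d, sizeof bits);   /* reinterpret the bytes, no conversion */
    for (i = 0; i < 64; i++)
        out[i] = ((bits >> (63 - i)) & 1u) ? '1' : '0';
    out[64] = '\0';
}
```

Calling it with a 65-byte buffer and printing the buffer with printf("%s\n", out) gives the full pattern on one line.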

The trick is to treat the value as an integer: read your 547.4242 as an unsigned long long (i.e. 64 bits or more), giving 5474242, while counting the number of digits after the '.', in this case 4. Now you have a value which is 10^4 times bigger than it should be. So you convert the 5474242 to floating point (as a double, or long double) and divide by 10^4.
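As a rough sketch of that trick (scaled_decimal_to_double is a hypothetical name; the sketch assumes unsigned input with at most one '.', no exponent, and few enough digits that neither the integer nor the power of ten overflows or loses precision):

```c
/* Sketch: read the digits as one integer, remember how many followed
   the '.', then divide by the matching power of ten. */
double scaled_decimal_to_double(const char *s)
{
    unsigned long long value = 0;  /* all digits, decimal point ignored */
    double scale = 1.0;            /* 10^(digits after the point) */
    int seen_point = 0;

    for (; *s != '\0'; s++) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
            continue;
        }
        if (*s < '0' || *s > '9')
            break;                 /* stop at the first unexpected character */
        value = value * 10 + (unsigned long long)(*s - '0');
        if (seen_point)
            scale *= 10.0;
    }
    return (double)value / scale;  /* a single rounding happens here */
}
```

For "547.4242" this computes 5474242 / 10^4; both operands are exact in a double, so only the division rounds.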

Decimal to binary conversion is deceptively simple. When you have more bits than the float will hold, it will have to round. More fun occurs when you have more digits than a 64-bit integer will hold (noting that trailing zeros are special) and you have to decide whether to round or not (and what rounding occurs when you float). Then there's dealing with an exponent such as E+/-99. Then, when you do the eventual division (or multiplication) by 10^n, you have (a) another potential rounding, and (b) the issue that large 10^n are not exactly represented in your floating point, which is another source of error. (And for E+/-99 forms, you may need up to and a little beyond 10^300 for the final step.)
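To make point (b) concrete: 10^n = 2^n * 5^n, and the factor 2^n only moves the exponent, so 10^n is exact in a double only while 5^n fits in the 53-bit significand. The sketch below checks that condition; pow10_is_exact_double is a hypothetical name, and the reasoning assumes IEEE 754 binary64 and n >= 0.

```c
#include <stdint.h>

/* Sketch: returns 1 if 10^n is exactly representable as a binary64
   double, i.e. if 5^n does not exceed 2^53; returns 0 otherwise. */
int pow10_is_exact_double(unsigned n)
{
    const uint64_t limit = (uint64_t)1 << 53;
    uint64_t five_pow = 1;

    while (n-- > 0) {
        if (five_pow > limit / 5)
            return 0;              /* the next multiply would pass 2^53 */
        five_pow *= 5;
    }
    return 1;
}
```

Under these assumptions the last exact power is 10^22, so any scaling step that needs a larger power of ten already starts from a rounded value.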

Enjoy!

In order to correctly round all possible decimal representations to the nearest double, you need big integers. Using only the basic integer types from C will leave you to re-implement big integer arithmetic. Each of these two approaches is possible; more information about each follows:

  1. For the first approach, you need a big integer library: GMP is a good one. Armed with such a big integer library, you tackle an input such as the example 123.456E78 as the integer 123456 * 10^75 and start wondering what values M in [2^53 … 2^54) and P in [-1022 … 1023] make (M / 2^53) * 2^P closest to this number. This question can be answered with big integer operations, following the steps described in this blog post (summary: first determine P, then use a division to compute M). A complete implementation must take care of subnormal numbers and infinities (inf is the correct result to return for any decimal representation of a number that would have an exponent larger than +1023).

  2. The second approach, if you do not want to include or implement a full general-purpose big integer library, still requires a few basic operations to be implemented on arrays of C integers representing large numbers. The function decfloat() in this implementation represents large numbers in base 10^9 because that simplifies the conversion from the initial decimal representation to the internal representation as an array x of uint32_t.
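To illustrate why base 10^9 is convenient for the second approach, here is a minimal sketch of the first step only, splitting a decimal digit string into 9-digit limbs. It is not the decfloat() function mentioned above; decimal_to_limbs is a hypothetical name, and the least-significant-limb-first layout is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch: store a decimal digit string as base-10^9 limbs in x,
   least significant limb first. Each limb is below 10^9 and therefore
   fits in a uint32_t. Returns the number of limbs written, or 0 if the
   string is empty, contains a non-digit, or needs more than max_limbs. */
size_t decimal_to_limbs(const char *digits, uint32_t *x, size_t max_limbs)
{
    size_t len = strlen(digits);
    size_t n = 0;

    while (len > 0) {
        /* Take up to nine digits from the right-hand end. */
        size_t start = len >= 9 ? len - 9 : 0;
        uint32_t limb = 0;
        size_t i;

        if (n == max_limbs)
            return 0;
        for (i = start; i < len; i++) {
            if (digits[i] < '0' || digits[i] > '9')
                return 0;
            limb = limb * 10 + (uint32_t)(digits[i] - '0');
        }
        x[n++] = limb;
        len = start;
    }
    return n;
}
```

No multiplication of big numbers is needed here: each group of nine decimal digits maps directly to one array element, which is the simplification the answer refers to.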

Following is a basic conversion, enough to get the OP started.

The OP's "integer part of real number" --> int is far too limiting. Better to simply convert the entire string to a large integer like uintmax_t. Note the decimal point '.' and account for overflow while scanning.

This code handles neither exponents nor negative numbers. It may be off in the last bit or so due to the limited range of the integer ui or the final scaling step num = ui * 10^expo. It handles most overflow cases.

#include <ctype.h>
#include <inttypes.h>
#include <math.h>
#include <stddef.h>

double my_atof(const char *src) {
  uintmax_t ui = 0;
  int dp = '.';
  size_t dpi = 0;     // index of the decimal point, valid once dp == '\0'
  size_t i = 0;
  size_t toobig = 0;  // digits dropped because ui would have overflowed
  int ch;
  for (i = 0; (ch = (unsigned char) src[i]) != '\0'; i++) {
    if (ch == dp) {
      dp = '\0';  // only get 1 dp
      dpi = i;
      continue;
    }
    if (!isdigit(ch)) {
      break; // illegal character
    }
    ch -= '0';
    // detect overflow
    if (toobig ||
        (ui >= UINTMAX_MAX / 10 &&
        (ui > UINTMAX_MAX / 10 || (uintmax_t) ch > UINTMAX_MAX % 10))) {
      toobig++;
      continue;
    }
    ui = ui * 10 + ch;
  }
  // each dropped digit scales the result up by 10
  intmax_t expo = (intmax_t) toobig;
  if (dp == '\0') {
    // each digit after the decimal point scales it down by 10
    expo -= (intmax_t) (i - dpi - 1);
  }

  double num;
  if (expo < 0) {
    // slightly more precise than: num = ui * pow(10.0, expo);
    num = ui / pow(10.0, (double) -expo);
  } else {
    num = ui * pow(10.0, (double) expo);
  }
  return num;
}
