
Printing the bits of a float number does not work; can you explain why?

I want to print the bits of a float number.

My plan was to:

  • Find out how many bits there are in total.
  • Create an int array and fill it with the bits of the number using simple right shifts.
  • Because the bits are inserted in the reverse of the order I need, run another loop over the array, this time from the last element to the first.

Why?

Let's take the number 4.2 as an example. I believe the bits are entered in the following way:

4 - 1000

2 - 0010

So together it's 10000010.

The array is filled FILO (first in, last out), so 0 will be the first element, but as you can see here, we need it at the end.

Here is my code:

float FloatAnalysis(float number)
{
    int arr[32] = {0};
    float num_cpy = number;
    size_t num_of_bits = 0, i = 0;
    
    while (0 != number)
    {
        num_cpy >>= 1;
        ++num_of_bits;
    }
    
    num_cpy = number;
    
    for(i = 0; i < num_of_bits; ++i)
    {
    
        arr[i] = (num_cpy & 1);
        num_cpy >>= 1;
    
    }
    
    
    for(i = num_of_bits-1; i => 0; --i)
    {
    
        printf("%d", arr[i]);
    
    }
    
}

And here is the output:

bitwise.c:359:11: error: invalid operands to binary >> (have ‘float’ and ‘int’)
  359 |   num_cpy >>= 1;
      |           ^~~
bitwise.c:368:21: error: invalid operands to binary & (have ‘float’ and ‘int’)
  368 |   arr[i] = (num_cpy & 1);
      |                     ^
bitwise.c:369:11: error: invalid operands to binary >> (have ‘float’ and ‘int’)
  369 |   num_cpy >>= 1;
      |           ^~~

Can you explain to me what is going on here?

Use memcpy

You cannot perform bitwise operations on a float.

You can use memcpy to copy your float into an unsigned int, which preserves its bits:

float num_cpy = number;

becomes

unsigned int num_cpy;
memcpy(&num_cpy, &number, sizeof(unsigned)); 

Note that if you instead just assign the float to the unsigned variable, even through a pointer cast such as:

num_cpy = *(float *)&number; 

the expression still has type float, so the assignment converts the value: the fractional part is stripped away and the integer value is preserved (as far as it can be), but you lose the binary representation of the float.


Example

In the example below,

float number = 42.42;
unsigned int num_cpy;
memcpy(&num_cpy, &number, sizeof(unsigned)); 
unsigned int num_cpy2 = *(float *)&number;
printf("Bits in num_cpy: %d    bits in num_cpy2: %d\n", __builtin_popcount(num_cpy), __builtin_popcount(num_cpy2));
printf("%u\n", num_cpy);   // %u, because num_cpy is unsigned
printf("%u\n", num_cpy2);

will output

Bits in num_cpy: 12    bits in num_cpy2: 3
1110027796 // memcpy
42 // cast

More reading

I especially recommend taking a look at the floating-point internal representation, which sums up very well what is going on at the bit level.

Internal representation: sign: 1 bit, exponent: 8 bits, fraction: 23 bits
(for a single-precision, 32-bit floating-point number, which is what we call float in C)

OP's code has various problems aside from the compiler errors.

i => 0 is not proper code. Perhaps OP wanted i >= 0? Even that has trouble.

size_t num_of_bits = 0, i = 0;
...
//  Bug: i >= 0 is always true as `i` is an unsigned type.
for(i = num_of_bits-1; i >= 0; --i)  {
    printf("%d", arr[i]);
}

OP's repaired code.

#include <assert.h>
#include <stdio.h>
#include <string.h>

float FloatAnalysis(float number) {
  assert(sizeof(float) == sizeof(unsigned));
  int arr[32] = {0};

  //float num_cpy = number;
  unsigned num_cpy;
  memcpy(&num_cpy, &number, sizeof num_cpy);  // copy the bit pattern to an unsigned

  // size_t num_of_bits = 0, i = 0;
  size_t num_of_bits = 32, i = 0;  // Always print 32 bits

  //while (0 != number) {
  //  num_cpy >>= 1;
  //  ++num_of_bits;
  //}
  //num_cpy = number;

  for (i = 0; i < num_of_bits; ++i) {
    arr[i] = (num_cpy & 1);
    num_cpy >>= 1;
  }

  // for(i = num_of_bits-1; i => 0; --i)
  for (i = num_of_bits; i-- > 0; ) { // Change test condition
    printf("%d", arr[i]);
  }
  printf("\n");

  // Some more output
  //      12345678901234567890123456789012
  printf("sEeeeeeeeMmmmmmmmmmmmmmmmmmmmmmm\n");
  printf("%a\n", number);
  return number;
}

int main() {
  FloatAnalysis(4.2f);
}

Output

01000000100001100110011001100110
sEeeeeeeeMmmmmmmmmmmmmmmmmmmmmmm
0x1.0cccccp+2

