How to store bytes of a float value in a string and retrieve the value afterwards?

I'm trying to figure out a way to send a sequence of float values over the network. I've seen various answers for this, and this is my current attempt:

#include <iostream>
#include <cstring>

union floatBytes
{
    float value;
    char bytes[sizeof (float)];
};

int main()
{
    floatBytes data1;
    data1.value = 3.1;
    std::string string(data1.bytes);
    floatBytes data2;
    strncpy(data2.bytes, string.c_str(), sizeof (float));

    std::cout << data2.value << std::endl; // <-- prints "3.1"

    return 0;
}

Which works nicely (though I suspect I might run into problems when sending this string to other systems, please comment).

However, if the float value is a round number (like 3.0 instead of 3.1) then this doesn't work.

data1.value = 3;
std::string string(data1.bytes);
floatBytes data2;
strncpy(data2.bytes, string.c_str(), sizeof (float));

std::cout << data2.value << std::endl; // <-- prints "0"

So what is the preferred way of storing the bytes of a float value, send it, and parse it "back" to a float value?

Never use str* functions this way. They are intended for C strings, and the byte representation of a float is certainly not a valid C string. What you need is to send and receive your data in a common representation. Many exist, but they fall into two basic kinds: a textual representation or a byte encoding.

Textual representation: this mostly consists of converting your float value to a string with a stringstream, then extracting the string and sending it over the connection.
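A minimal sketch of the textual approach, assuming the receiver parses with the same classic "C" locale. The helper names floatToText and textToFloat are made up for illustration. Writing max_digits10 significant digits is what lets the same float be recovered on the other side; infinities and NaN are not handled here.

```cpp
#include <iomanip>
#include <limits>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format a float as decimal text that round-trips exactly.
std::string floatToText(float value)
{
    std::ostringstream out;
    out.imbue(std::locale::classic()); // always use '.' as the decimal point
    // max_digits10 significant digits are enough to recover the same float.
    out << std::setprecision(std::numeric_limits<float>::max_digits10) << value;
    return out.str();
}

// Hypothetical helper: parse the text back; returns false if no number was read.
bool textToFloat(const std::string &text, float &value)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    return static_cast<bool>(in >> value);
}
```

The sender would transmit floatToText(x) followed by a delimiter such as a newline, and the receiver would call textToFloat on each received token.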

Byte representation: this is much more problematic, because if the two machines do not use the same byte ordering, float encoding, etc., then you can't send the raw bytes as-is. But there exists (at least) one standard, known as XDR (RFC 4506), that specifies how to encode the bytes of a float/double value natively encoded with IEEE 754.
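A minimal sketch of the byte approach, assuming both ends use IEEE 754 single precision and agree to send the most significant byte first (an XDR-style layout). The names encodeFloatBE and decodeFloatBE are made up for illustration. The shifts work on the integer value, so the host byte order does not matter, provided float and uint32_t share the same byte order on the host, which is assumed here.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes float is IEEE 754 binary32");

// Hypothetical helper: encode a float as 4 bytes, most significant byte first.
std::array<unsigned char, 4> encodeFloatBE(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits); // well-defined way to read the bits
    return {{ static_cast<unsigned char>((bits >> 24) & 0xFF),
              static_cast<unsigned char>((bits >> 16) & 0xFF),
              static_cast<unsigned char>((bits >> 8) & 0xFF),
              static_cast<unsigned char>(bits & 0xFF) }};
}

// Hypothetical helper: rebuild the float from 4 big-endian bytes.
float decodeFloatBE(const std::array<unsigned char, 4> &bytes)
{
    std::uint32_t bits = (static_cast<std::uint32_t>(bytes[0]) << 24) |
                         (static_cast<std::uint32_t>(bytes[1]) << 16) |
                         (static_cast<std::uint32_t>(bytes[2]) << 8) |
                          static_cast<std::uint32_t>(bytes[3]);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

With this layout 3.0f is sent as the bytes 40 40 00 00, whatever the byte order of the sending machine.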

You can reconstitute a float portably with rather involved code, which I maintain in my IEEE754 GitHub repository. If you break the float into bytes using the write function and reconstitute it using the read function, you will obtain the same value at the receiver as you sent, regardless of float encoding, up to the precision of the format.

https://github.com/MalcolmMcLean/ieee754

#include <stdio.h>
#include <math.h>
#include <float.h>

float freadieee754f(FILE *fp, int bigendian)
{
   unsigned long buff = 0;
   unsigned long buff2 = 0;
   unsigned long mask;
   int sign;
   int exponent;
   int shift;
   int i;
   int significandbits = 23;
   int expbits = 8;
   double fnorm = 0.0;
   double bitval;
   double answer;

   for(i=0;i<4;i++)
     buff = (buff << 8) | fgetc(fp);
   if(!bigendian)
   {
     for(i=0;i<4;i++)
     {
       buff2 <<= 8;
       buff2 |= (buff & 0xFF);
       buff >>= 8;
     }
     buff = buff2; 
   }

   sign = (buff & 0x80000000) ? -1 : 1;
   mask = 0x00400000;
   exponent = (buff & 0x7F800000) >> 23;
   bitval = 0.5;
   for(i=0;i<significandbits;i++)
   {
     if(buff & mask)
        fnorm += bitval;
     bitval /= 2;
     mask >>= 1;
   }
   if(exponent == 0 && fnorm == 0.0)
     return 0.0f;
   shift = exponent - ((1 << (expbits - 1)) - 1); /* exponent = shift + bias */

   if(shift == 128 && fnorm != 0.0)
     return (float) sqrt(-1.0);
   if(shift == 128 && fnorm == 0.0)
   {
#ifdef INFINITY
     return sign == 1 ? INFINITY : -INFINITY;
#endif
     return (sign * 1.0f)/0.0f;
   }
   if(shift > -127)
   {
     answer = ldexp(fnorm + 1.0, shift);
     return (float) answer * sign;
   }
   else
   {
     if(fnorm == 0.0)
     {
       return 0.0f;
     }
     shift = -126;
     while (fnorm < 1.0)
     {
         fnorm *= 2;
         shift--;
     }
     answer = ldexp(fnorm, shift);
     return (float) answer * sign;
   }
}


int fwriteieee754f(float x, FILE *fp, int bigendian)
{
    int shift;
    unsigned long sign, exp, hibits, buff;
    double fnorm, significand;
    int expbits = 8;
    int significandbits = 23;

    /* zero (can't handle signed zero) */
    if (x == 0)
    {
        buff = 0;
        goto writedata;
    }
    /* infinity */
    if (x > FLT_MAX)
    {
        buff = 128 + ((1 << (expbits - 1)) - 1);
        buff <<= (31 - expbits);
        goto writedata;
    }
    /* -infinity */
    if (x < -FLT_MAX)
    {
        buff = 128 + ((1 << (expbits - 1)) - 1);
        buff <<= (31 - expbits);
        buff |= (1UL << 31); /* unsigned shift: 1 << 31 overflows a 32-bit int */
        goto writedata;
    }
    /* NaN - dodgy because many compilers optimise out this test, but
    *there is no portable isnan() */
    if (x != x)
    {
        buff = 128 + ((1 << (expbits - 1)) - 1);
        buff <<= (31 - expbits);
        buff |= 1234;
        goto writedata;
    }

    /* get the sign */
    if (x < 0) { sign = 1; fnorm = -x; }
    else { sign = 0; fnorm = x; }

    /* get the normalized form of f and track the exponent */
    shift = 0;
    while (fnorm >= 2.0) { fnorm /= 2.0; shift++; }
    while (fnorm < 1.0) { fnorm *= 2.0; shift--; }

    /* check for denormalized numbers */
    if (shift < -126)
    {
        while (shift < -126) { fnorm /= 2.0; shift++; }
        shift = -127; /* biased exponent 0 marks a denormal in the 32-bit format */
    }
    /* out of range. Set to infinity */
    else if (shift > 128)
    {
        buff = 128 + ((1 << (expbits - 1)) - 1);
        buff <<= (31 - expbits);
        buff |= (sign << 31);
        goto writedata;
    }
    else
        fnorm = fnorm - 1.0; /* take the significant bit off mantissa */

    /* calculate the integer form of the significand */
    /* hold it in a  double for now */

    significand = fnorm * ((1LL << significandbits) + 0.5f);


    /* get the biased exponent */
    exp = shift + ((1 << (expbits - 1)) - 1); /* shift + bias */

    hibits = (long)(significand);
    buff = (sign << 31) | (exp << (31 - expbits)) | hibits;

writedata:
    /* write the bytes out to the stream */
    if (bigendian)
    {
        fputc((buff >> 24) & 0xFF, fp);
        fputc((buff >> 16) & 0xFF, fp);
        fputc((buff >> 8) & 0xFF, fp);
        fputc(buff & 0xFF, fp);
    }
    else
    {
        fputc(buff & 0xFF, fp);
        fputc((buff >> 8) & 0xFF, fp);
        fputc((buff >> 16) & 0xFF, fp);
        fputc((buff >> 24) & 0xFF, fp);
    }
    return ferror(fp);
}

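Let me first clear up the issue with your code. You are using strncpy, which stops copying the moment it sees '\0' and fills the rest of the destination with zeros (the std::string constructor taking a char* stops at '\0' as well). The float 3.0f has the bit pattern 0x40400000, so on a little-endian machine its very first byte is zero and none of your data is copied.

And thus the 0 is expected.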

Using memcpy instead of strncpy should do the trick.

I just tried this C++ code

#include <cstdio>
#include <cstring>

int main(){
        float f = 3.34;
        printf("before = %f\n", f);
        char a[10];
        memcpy(a, (char*) &f, sizeof(float));
        a[sizeof(float)] = '\0'; // For sending over network
        float f1 = 1.99;
        memcpy((char*) &f1, a, sizeof(float));
        printf("after = %f\n", f1);
        return 0;
}

I get the correct output as expected.
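If the bytes really have to travel inside a std::string, a sketch along these lines keeps embedded zero bytes by passing the length explicitly instead of relying on a terminator. The helper names floatToByteString and byteStringToFloat are made up for illustration, and this still assumes sender and receiver share the same float representation and byte order.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: copy the raw bytes of a float into a std::string.
std::string floatToByteString(float value)
{
    char bytes[sizeof(float)];
    std::memcpy(bytes, &value, sizeof bytes);
    // The (pointer, length) constructor keeps '\0' bytes instead of stopping at them.
    return std::string(bytes, sizeof bytes);
}

// Hypothetical helper: copy the bytes back; returns false on a wrong length.
bool byteStringToFloat(const std::string &s, float &value)
{
    if (s.size() != sizeof(float))
        return false;
    std::memcpy(&value, s.data(), sizeof value);
    return true;
}
```

When sending, use s.data() together with s.size() rather than treating s.c_str() as a terminated string, otherwise the zero bytes are lost again.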

Now coming to the correctness. Copying the bytes of a float into a char buffer with memcpy and back again is well defined, because float is trivially copyable. It is reading from a union member other than the one last written (type punning through a union, as in the question) that is undefined behaviour in C++, even though most compilers accept it.

This is all okay as long as I am doing it for the same program.

Now for your problem of sending it over network. I don't think this would be the correct way of doing it. Like @Jean-Baptiste Yunès mentioned, both the systems could be using different representations for float, or even different ordering for bytes.

In that case you need to use a library to convert it to some standard representation like IEEE 754.

The main problem is that C++ does not enforce IEEE 754, so a representation that works between two computers may fail with a third.

The problem has to be divided into two parts:

  1. How to encode and decode a float to shared format
  2. How to serialize the value to a char array for transmission.

How to encode/decode a float to a common format

C++ does not impose a specific bit format. This means one computer might transfer a float and the other machine might read a different value from the same bits.

Example of 1.0f

Machine1: sign + 8bit Exponent + 23bit mantissa: 0-01111111-00000000000000000000000

Machine2: sign + 7bit exponent + 24bit mantissa: 0-0111111-000000000000000000000000

Sending from machine 1 to machine 2 without a shared format would result in machine 2 receiving: 0-0111111-100000000000000000000000 = 1.5
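The 1.5 above can be checked with a small sketch that decodes a 32-bit pattern the way the imaginary Machine2 would. The layout (1 sign bit, 7 exponent bits, 24 mantissa bits) comes from the example; the exponent bias of 63 and the function name decodeAsMachine2 are assumptions made for illustration, and zero, denormals and special values are ignored.

```cpp
#include <cmath>
#include <cstdint>

// Decode bits using the hypothetical Machine2 layout:
// 1 sign bit, 7 exponent bits (bias 63 assumed), 24 mantissa bits.
double decodeAsMachine2(std::uint32_t bits)
{
    int sign = (bits >> 31) ? -1 : 1;
    int exponent = static_cast<int>((bits >> 24) & 0x7F);
    std::uint32_t mantissa = bits & 0xFFFFFF;
    // Implicit leading 1 plus the fractional part mantissa / 2^24.
    double fraction = 1.0 + mantissa / 16777216.0;
    return sign * std::ldexp(fraction, exponent - 63);
}
```

Machine1 encodes 1.0f as 0x3F800000, and decodeAsMachine2(0x3F800000u) evaluates to 1.5, because the lowest exponent bit of the sender lands in the top mantissa bit of the receiver.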

This is a complex topic and may be difficult to solve completely across platforms. C++ includes a few convenience properties that help with this:

bool isIeee754 = std::numeric_limits<float>::is_iec559;

The catch is that the compiler may not know the exact CPU architecture on which its output will run, so this check is only partly reliable. Fortunately, the reported bit format is correct in most cases. Additionally, if the format is not known, it may be very difficult to normalize it.

We might design some code to detect the float format, or we might decide to skip those cases as "unsupported platforms".

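In the case of 32-bit IEEE 754, we can easily extract the mantissa, sign and exponent with bitwise operations, once the bits have been copied into an integer:

float input = 1.0f;
uint32_t bits;
// Bitwise operators are not defined for float, so copy the bits first
// (needs <cstdint> and <cstring>).
std::memcpy(&bits, &input, sizeof(bits));
uint8_t exponent = (bits>>23)&0xFF;
uint32_t mantissa = (bits&0x7FFFFF);
bool sign = (bits>>31) != 0;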

A standard format for transmission could well be 32-bit IEEE 754, in which case most platforms need no conversion at all:

bool isStandard32BitIeee754( float f)
{
    // TODO: improve for "special" targets.
    return std::numeric_limits<decltype(f)>::is_iec559 && sizeof(f)==4;
}

Finally, and especially for those non-standard platforms, special values must be reserved for NaN and infinity.

Serialization of a float for transmission

The second issue is much simpler: the standardized binary only has to be transformed into a char array. However, not all characters may be acceptable on the network, especially if the value is carried over HTTP or an equivalent text protocol.

For this example, I will convert the bytes to hexadecimal encoding (an alternative could be Base64, etc.).

Note: I know there are some functions which may help; I deliberately use simple C++ to show the steps at as low a level as possible.

void toHex( uint8_t &out1, uint8_t &out2, uint8_t in)
{
    out1 = in>>4;
    out1 = out1>9? out1-10+'A' : out1+'0';
    out2 = in&0xF;
    out2 = out2>9? out2-10+'A' : out2+'0';
}

void standardFloatToHex (float in, std::string &out)
{
    // Copy the bytes out with std::memcpy (from <cstring>); reading the
    // inactive member of a union is undefined behaviour in C++.
    uint8_t bytes[4];
    std::memcpy(bytes, &in, sizeof(bytes));
    out.resize(8);
    for (int i=0; i<4; i++)
    {
        // Could use std::stringstream as an alternative.
        uint8_t c1, c2, c = bytes[i];
        toHex(c1, c2, c);
        out[i*2] = c1;
        out[i*2+1] = c2;
    }
}

Finally, the equivalent decoding is required on the receiving side.
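A possible sketch of that decoding step, mirroring standardFloatToHex: it reads two hexadecimal digits per byte, in the same byte order the encoder wrote them. The names fromHexDigit and hexToStandardFloat are made up for illustration, and, like the encoder, this assumes a 4-byte float with the same representation and byte order on both sides.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if the character is not one.
int fromHexDigit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical counterpart of standardFloatToHex; returns false on malformed input.
bool hexToStandardFloat(const std::string &in, float &out)
{
    if (in.size() != 2 * sizeof(float))
        return false;
    unsigned char bytes[sizeof(float)];
    for (std::size_t i = 0; i < sizeof(float); i++)
    {
        int hi = fromHexDigit(in[2 * i]);
        int lo = fromHexDigit(in[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return false;
        bytes[i] = static_cast<unsigned char>((hi << 4) | lo);
    }
    // Copy the reassembled bytes back into the float.
    std::memcpy(&out, bytes, sizeof out);
    return true;
}
```

Rejecting input of the wrong length or with non-hex characters matters on the receiving side, since data arriving from the network cannot be trusted to be well formed.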

Conclusion

The standardization of the float value into a shared bit format has been explained. Some implementation-dependent conversions may be required.

The serialization for most common network protocols is shown.
