
floating point binary portability

I have a library that saves loads of floating point data to disk in text form. It seems this was done for portability, but because of the huge disk usage I've written a function that saves the binary representation of the floating point values directly to disk. I know this doesn't guarantee 100% portability, but I'll only run it on x86(_64) Linux/Windows PCs (maybe also on Mac and the BSDs).

Is there a way to at least check whether the floating point format the program understands is also okay with the system? And how much incompatibility should I expect from dealing with floating point data in binary form?

Is there a way to at least check whether the floating point format the program understands is also okay with the system?

Test 1: sizeof. Test 2: save a magic floating point value in the header of your on-disk file and check in the program that it has the right value after you've read the binary data from the disk. This should be safe enough.
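The two tests can be sketched as follows. This is only an illustration: the check value `FLOAT_MAGIC` and the helper names `write_float_header` and `check_float_header` are made up for this example, and the header here consists of nothing but one raw double.

```c
#include <stdio.h>

/* Hypothetical check value: any constant with a distinctive bit pattern. */
#define FLOAT_MAGIC 1234.5678

/* Write the check value as raw bytes at the start of the file.
 * Returns 0 on success, -1 on failure. */
static int write_float_header(FILE *fp)
{
    double magic = FLOAT_MAGIC;
    return fwrite(&magic, sizeof magic, 1, fp) == 1 ? 0 : -1;
}

/* Returns 0 if the file's floating point data looks readable on this
 * system, -1 otherwise. */
static int check_float_header(FILE *fp)
{
    double magic;

    /* Test 1: the on-disk format assumed here uses 8-byte doubles. */
    if (sizeof(double) != 8)
        return -1;

    /* Test 2: the raw bytes must decode to exactly the value written.
     * A different byte order or format yields some other value or NaN. */
    if (fread(&magic, sizeof magic, 1, fp) != 1)
        return -1;
    return magic == FLOAT_MAGIC ? 0 : -1;
}
```

The writer calls `write_float_header` once before dumping the data; the reader calls `check_float_header` first and falls back to the text format (or refuses the file) if it returns -1.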

And how much incompatibility should I expect from dealing with floating point data in binary form?

Very little. If, as you're saying, you're staying with just one hardware architecture (x86), you'll be fine. If you have a limited set of supported architectures - just test all of them. On x86 everyone will be using hardware floating point which limits how creative they can be (pretty much not at all). Even between architectures everyone I know of who uses IEEE 754 floating point has the same binary representation for the same endianness.

Floating point numbers have the odd problem that there is no widely used standard for their binary on-disk or on-wire representation. That said, every implementation I've looked at does one of two things: it either writes strings, or it stores the bit pattern in an equally sized integer, adjusts for endianness, and brutally casts back to float.
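A minimal sketch of the second approach, assuming the host `double` is IEEE 754 binary64 and has the same byte order as `uint64_t` (true on x86). The names `encode_double_le` and `decode_double_le` are invented for this example, and `memcpy` stands in for the "brutal cast" because it avoids aliasing problems.

```c
#include <stdint.h>
#include <string.h>

/* Encode a double as 8 bytes, least significant byte first,
 * regardless of the host byte order. */
static void encode_double_le(double x, unsigned char out[8])
{
    uint64_t bits;
    int i;

    memcpy(&bits, &x, sizeof bits);       /* copy the bit pattern */
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)(bits >> (8 * i));
}

/* Rebuild a double from 8 bytes written by encode_double_le. */
static double decode_double_le(const unsigned char in[8])
{
    uint64_t bits = 0;
    double x;
    int i;

    for (i = 0; i < 8; i++)
        bits |= (uint64_t)in[i] << (8 * i);
    memcpy(&x, &bits, sizeof x);          /* reinterpret as double */
    return x;
}
```

Because the shifts work on the integer's value rather than its memory layout, the same code produces the same file on little-endian and big-endian hosts.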

Look up the binary portability website. https://github.com/MalcolmMcLean/ieee754

The function to write an IEEE 754 portably is quite long, but it's just a cut and paste job. There's also a float version.

#include <float.h>  /* DBL_MAX */
#include <stdio.h>  /* FILE, fputc, ferror, EOF */

/*
* write a double to a stream in ieee754 format regardless of host
*  encoding.
*  x - number to write
*  fp - the stream
*  bigendian - set to write big bytes first, else write little bytes
*              first
*  Returns: 0 or EOF on error
*  Notes: different NaN types and negative zero not preserved.
*         if the number is too big to represent it will become infinity
*         if it is too small to represent it will become zero.
*/
int fwriteieee754(double x, FILE *fp, int bigendian)
{
    int shift;
    unsigned long sign, exp, hibits, hilong, lowlong;
    double fnorm, significand;
    int expbits = 11;
    int significandbits = 52;

    /* zero (can't handle signed zero) */
    if (x == 0)
    {
        hilong = 0;
        lowlong = 0;
        goto writedata;
    }
    /* infinity */
    if (x > DBL_MAX)
    {
        hilong = 1024 + ((1 << (expbits - 1)) - 1);
        hilong <<= (31 - expbits);
        lowlong = 0;
        goto writedata;
    }
    /* -infinity */
    if (x < -DBL_MAX)
    {
        hilong = 1024 + ((1 << (expbits - 1)) - 1);
        hilong <<= (31 - expbits);
        hilong |= (1UL << 31); /* unsigned: shifting 1 into the sign bit of an int is undefined */
        lowlong = 0;
        goto writedata;
    }
    /* NaN - dodgy because some compilers optimise out this test under
    *fast-math options; C99 and later also offer isnan() in <math.h> */
    if (x != x)
    {
        hilong = 1024 + ((1 << (expbits - 1)) - 1);
        hilong <<= (31 - expbits);
        lowlong = 1234;
        goto writedata;
    }

    /* get the sign */
    if (x < 0) { sign = 1; fnorm = -x; }
    else { sign = 0; fnorm = x; }

    /* get the normalized form of f and track the exponent */
    shift = 0;
    while (fnorm >= 2.0) { fnorm /= 2.0; shift++; }
    while (fnorm < 1.0) { fnorm *= 2.0; shift--; }

    /* check for denormalized numbers */
    if (shift < -1022)
    {
        while (shift < -1022) { fnorm /= 2.0; shift++; }
        shift = -1023;
    }
    /* out of range. Set to infinity */
    else if (shift > 1023)
    {
        hilong = 1024 + ((1 << (expbits - 1)) - 1);
        hilong <<= (31 - expbits);
        hilong |= (sign << 31);
        lowlong = 0;
        goto writedata;
    }
    else
        fnorm = fnorm - 1.0; /* take the significant bit off mantissa */

    /* calculate the integer form of the significand */
    /* hold it in a  double for now */

    /* fnorm has at most 52 fraction bits, so the product is an exact integer */
    significand = fnorm * (double)(1LL << significandbits);


    /* get the biased exponent */
    exp = shift + ((1 << (expbits - 1)) - 1); /* shift + bias */

    /* put the data into two longs (for convenience) */
    hibits = (unsigned long)(significand / 4294967296.0);
    hilong = (sign << 31) | (exp << (31 - expbits)) | hibits;
    lowlong = (unsigned long)(significand - hibits * 4294967296.0);

writedata:
    /* write the bytes out to the stream */
    if (bigendian)
    {
        fputc((hilong >> 24) & 0xFF, fp);
        fputc((hilong >> 16) & 0xFF, fp);
        fputc((hilong >> 8) & 0xFF, fp);
        fputc(hilong & 0xFF, fp);

        fputc((lowlong >> 24) & 0xFF, fp);
        fputc((lowlong >> 16) & 0xFF, fp);
        fputc((lowlong >> 8) & 0xFF, fp);
        fputc(lowlong & 0xFF, fp);
    }
    else
    {
        fputc(lowlong & 0xFF, fp);
        fputc((lowlong >> 8) & 0xFF, fp);
        fputc((lowlong >> 16) & 0xFF, fp);
        fputc((lowlong >> 24) & 0xFF, fp);

        fputc(hilong & 0xFF, fp);
        fputc((hilong >> 8) & 0xFF, fp);
        fputc((hilong >> 16) & 0xFF, fp);
        fputc((hilong >> 24) & 0xFF, fp);
    }
    return ferror(fp) ? EOF : 0; /* match the documented return value */
}

You can look up the new (C11) and old macros in the header <float.h>; see section 5.2.4.2.2, "Characteristics of floating types" (page 46).

In general you should be fine to directly read and write the binary data: the IEEE754 binary interchange format is pretty much standard outside of a few niche areas. You can use the __STDC_IEC_559__ macro to check.
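As a rough sketch, the macro check can be combined with the `<float.h>` characteristics. The function names `double_looks_like_binary64` and `compiler_claims_iec559` are made up for this example. Note that, as far as I know, some compilers leave `__STDC_IEC_559__` undefined even though they use IEEE 754 formats, so its absence proves little on its own.

```c
#include <float.h>
#include <limits.h>

/* Returns 1 if the <float.h> characteristics of double match
 * IEEE 754 binary64, 0 otherwise. This checks the format parameters
 * only, not the byte order. */
static int double_looks_like_binary64(void)
{
    return sizeof(double) * CHAR_BIT == 64
        && FLT_RADIX == 2
        && DBL_MANT_DIG == 53      /* 52 stored bits + implicit bit */
        && DBL_MAX_EXP == 1024
        && DBL_MIN_EXP == -1021;
}

/* Returns 1 if the compiler advertises IEC 60559 (IEEE 754)
 * conformance through the standard macro, 0 otherwise. */
static int compiler_claims_iec559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}
```

A program could refuse to use the raw binary path when `double_looks_like_binary64()` returns 0 and treat `compiler_claims_iec559()` as an extra hint.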

As noted in this question, one thing the spec does not specify is the precise mapping of bits to bytes, so there is potential for endianness issues (though probably not if you're exclusively using x86/x86_64). It might be a good idea to include a floating point check value at the start of your stream (note that it is not sufficient to check the endianness of your integers, as it is technically possible to have different endianness for integer and floating point).

If you're writing text, one alternative to consider is the hex float format, which can be much faster to read and write than decimal formats (though not as fast as the raw binary interchange format). Unfortunately, though it is part of both the IEEE and C99 specs, it has been poorly supported by the MSVC compiler (though this may change now that it is part of C++).
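For illustration, here is a small sketch of the hex float round trip using the C99 `%a` conversion and `strtod`. The wrapper names `double_to_hex` and `hex_to_double` are invented for this example, and it assumes a C library with C99 hex float support.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format x as a C99 hex float string in buf. Returns the number of
 * characters that would be written (as snprintf does), or a negative
 * value on error. */
static int double_to_hex(double x, char *buf, size_t size)
{
    return snprintf(buf, size, "%a", x);
}

/* Parse a hex float string back into a double. */
static double hex_to_double(const char *s)
{
    return strtod(s, NULL);
}
```

With a binary floating point format, `%a` without a precision prints enough hex digits to represent the value exactly, so the round trip loses nothing and no decimal conversion is involved.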
