
Double precision floating-point comparison

I'm a little confused here: would comparison of doubles still work correctly when they're stored as opaque (binary) fields? The problem I'm facing is that a double has a leading sign bit (i.e. positive or negative), and when doubles are stored as binary data I'm not sure they will be compared correctly.


I want to ensure that the comparison will work correctly, because I'm using a double as part of a key tuple (eg ) in LevelDB and I want to preserve data locality for positive and negative numbers. LevelDB treats keys as opaque byte strings, but it does allow the user to specify their own comparator. However, I don't want to specify a comparator unless I absolutely need to:

// Three-way comparison function:
//   if a < b: negative result
//   if a > b: positive result
//   else: zero result
inline int Compare(const unsigned char* a, const unsigned char* b) const 
{
    if (*(double*)a < *(double*)b) return -1;
    if (*(double*)a > *(double*)b) return +1;
    return 0;
}

Making my comments an answer.

There are two things that could go wrong:

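  1. If either (or both) parameters is NaN, the ordered comparisons < and > always return false, so the function above returns 0 and reports NaN as equal to every other value. (With ==, NaN == NaN is false even when the binary representations are identical.) This breaks the transitivity that a comparator is expected to have.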

  2. If either parameter isn't properly aligned (since they are char pointers), you could run into problems on machines that don't support misaligned memory access. And for those that do, you may encounter a performance hit.

So to get around this problem, you'll need to add a trap case that will be invoked if either parameter turns out to be NaN. (I'm not sure about the status of INF.)

Because of the need for this trap case, you will need to define your own comparison operator.
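The two points above could be handled along the following lines. This is only a sketch: the names LoadDouble and CompareDoubles are made up for illustration, and the choice to sort NaN after every other value (and to treat all NaNs as equal) is an assumption, not something LevelDB requires.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>

// Hypothetical helper: read a double from a possibly misaligned byte buffer.
inline double LoadDouble(const unsigned char* p) {
    assert(p != nullptr);
    double d;
    std::memcpy(&d, p, sizeof d);  // copy instead of casting the pointer
    return d;
}

// Three-way comparison with an explicit NaN trap case.
// Assumption: NaN sorts after every other value, and all NaNs compare equal.
inline int CompareDoubles(const unsigned char* a, const unsigned char* b) {
    const double x = LoadDouble(a);
    const double y = LoadDouble(b);
    const bool x_nan = std::isnan(x);
    const bool y_nan = std::isnan(y);
    if (x_nan || y_nan) {
        // 0 if both are NaN, +1 if only x is NaN, -1 if only y is NaN.
        return static_cast<int>(x_nan) - static_cast<int>(y_nan);
    }
    if (x < y) return -1;
    if (x > y) return +1;
    return 0;
}
```

Using memcpy rather than a pointer cast sidesteps the alignment problem, and compilers typically turn it into a plain load where the hardware allows.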

Yes, you have to specify your own comparison function. This is because doubles are not necessarily stored as 'big-endian' values. On a little-endian machine, the bytes holding the sign and exponent come last in memory, after the mantissa bytes, even though logically they come first when the value is written out in big-endian format.
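To make the byte-order point concrete, here is a small sketch that writes a double's bit pattern most-significant byte first, whatever the host's byte order. StoreBigEndian is a hypothetical name, and the code assumes a 64-bit IEEE 754 double whose in-memory byte order matches that of a 64-bit integer on the same machine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: write the bit pattern of `value` into `out`,
// most-significant byte (sign and top exponent bits) first.
inline void StoreBigEndian(double value, unsigned char out[8]) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "assumes a 64-bit double");
    assert(out != nullptr);
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    for (int i = 0; i < 8; ++i) {
        // Shifting extracts bytes by significance, not by memory position.
        out[i] = static_cast<unsigned char>(bits >> (56 - 8 * i));
    }
}
```

For 1.0 the first two bytes written are 0x3F 0xF0, and for 2.0 they are 0x40 0x00, so a bytewise comparison orders these two correctly. This only takes care of byte order: negative values would still sort the wrong way round when compared as raw bytes.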

Of course, if you're sharing stuff between different CPU architectures in the same database, you may end up with weird endian problems anyway just because you stored stuff as binary blobs.

Lastly, even if you could control for endianness I would still not trust it. For example, +0.0 and -0.0 are numerically equal but have different bit patterns, and NaN has many different encodings, so comparing doubles as binary data will not always agree with comparing them numerically.

Of course, everything the other person said about alignment and odd values like NaN and INF is important to pay attention to when writing a comparison function. But as for whether you should write one at all, I would say that it is a really good idea.

I assume that your number format conforms to the IEEE 754 standard. If that's the case, then a simple signed-integer comparison won't work: the format is sign-magnitude, so if both numbers are negative, the result of the comparison is reversed. So you do have to provide your own comparator.
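A sketch of that reversal, and of one common way to correct it inside a comparator, might look like this. All function names here are hypothetical. The code assumes 64-bit IEEE 754 doubles stored in the host's native byte order, and it does not treat NaN specially.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the 8 bytes at `p` as a signed integer.
inline std::int64_t LoadBits(const unsigned char* p) {
    assert(p != nullptr);
    std::int64_t bits;
    std::memcpy(&bits, p, sizeof bits);
    return bits;
}

// Naive version: signed-integer comparison of the raw bit patterns.
// The result is reversed when both doubles are negative.
inline int NaiveCompare(const unsigned char* a, const unsigned char* b) {
    const std::int64_t x = LoadBits(a), y = LoadBits(b);
    return (x > y) - (x < y);
}

// For negative values, flip the 63 magnitude bits so that a larger
// magnitude maps to a smaller signed integer.
inline std::int64_t FixOrder(std::int64_t bits) {
    return bits < 0 ? (bits ^ INT64_MAX) : bits;
}

// Corrected version: compare the adjusted bit patterns.
inline int FixedCompare(const unsigned char* a, const unsigned char* b) {
    const std::int64_t x = FixOrder(LoadBits(a)), y = FixOrder(LoadBits(b));
    return (x > y) - (x < y);
}
```

With -2.0 and -1.0 as inputs, NaiveCompare reports the first as greater, while FixedCompare reports it as smaller. Note that FixedCompare orders -0.0 before +0.0 rather than treating them as equal.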
