简体   繁体   中英

Why not enforce 2's complement in C++?

The new C++ standard still refuses to specify the binary representation of integer types. Is this because there are real-world implementations of C++ that don't use 2's complement arithmetic? I find that hard to believe. Is it because the committee feared that future advances in hardware would render the notion of 'bit' obsolete? Again hard to believe. Can anyone shed any light on this?

Background: I was surprised twice in one comment thread (Benjamin Lindley's answer tothis question ). First, from piotr's comment:

Right shift on signed type is undefined behaviour

Second, from James Kanze's comment:

when assigning to a long, if the value doesn't fit in a long, the results are implementation defined

I had to look these up in the standard before I believed them. The only reason for them is to accommodate non-2's-complement integer representations. WHY?

(Edit: C++20 now imposes 2's complement representation, note that overflow of signed arithmetic is still undefined and shifts continue to have undefined and implementation defined behaviors in some cases.)

  • A major problem in defining something which isn't, is that compilers were built assuming that is undefined. Changing the standard won't change the compilers and reviewing those to find out where the assumption was made is a difficult task.

  • Even on 2 complement machine, you may have more variety than you think. Two examples: some don't have a sign preserving right shift, just a right shift which introduce zeros; a common feature in DSP is saturating arithmetic, there assigning an out of range value will clip it at the maximum, not just drop the high order bits.

I suppose it is because the Standard says, in 3.9.1[basic.fundamental]/7

this International Standard permits 2's complement, 1's complement and signed magnitude representations for integral types.

which, I am willing to bet, came along from the C programming language, which lists sign and magnitude , two's complement , and one's complement as the only allowed representations in 6.2.6.2/2 . And there sure were 1's complement systems around when C was wide-spread: UNIVACs are the most often mentioned, it seems.

It seems to me that, even today, if you are writing a broadly-applicable C++ library that you expect to run on any machine, then 2's complement cannot be assumed. C++ is just too widely used to be making assumptions like that.

Most people don't write those sorts of libraries, though, so if you want to take a dependency on 2's complement you should just go ahead.

Many aspects of the language standard are as they are because the Standards Committee has been extremely loath to forbid compilers from behaving in ways that existing code may rely upon. If code exists which would rely upon one's complement behavior, then requiring that compilers behave as though the underlying hardware uses two's complement would make it impossible for the older code to run using newer compilers.

The solution, which the Standards Committee has alas not yet seen fit to implement, would be to allow code to specify the desired semantics for things in a fashion independent of the machine's word size or hardware characteristics. If support for code which relies upon ones'-complement behavior is deemed important, design a means by which code could expressly demand one's-complement behavior regardless of the underlying hardware platform. If desired, to avoid overly complicating every single compiler, specify that certain aspects of the standard are optional, but conforming compilers must document which aspects they support. Such a design would allow compilers for ones'-complement machines to support both two's-complement behavior and ones'-complement behavior depending upon the needs of the program. Further, it would make it possible to port the code to two's-complement machines with compilers that happened to include ones'-complement support.

I'm not sure exactly why the Standards Committee has as yet not allowed any way by which code can specify behavior in a fashion independent of the underlying architecture and word size (so that code wouldn't have some machines use signed semantics for comparisons where other machines would use unsigned semantics), but for whatever reason they have yet to do so. Support for ones'-complement representation is but a part of that.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM