
Binary Floating Point Addition

How does 1.000(base2) x 2^-1 + (-0.111(base2) x 2^-1) = 0.001(base2) x 2^-1? To add binary numbers, don't you simply add them? I'm not seeing how the addition works.

I'm not sure what you mean when you ask "don't you simply just add?", but the math is correct. It is basically base-2 scientific notation: multiplying by 2^-1 shifts the binary point one place to the left.

1.000(base2) x 2^-1 = 0.100(base2)
-0.111(base2) x 2^-1 = -0.0111(base2)

0.1000(base2) + (-0.0111(base2)) = 0.0001(base2)    (in decimal: 0.5 - 0.4375 = 0.0625)

0.0001(base2) = 0.001(base2) x 2^-1
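
As a quick sanity check of the steps above, here is a minimal Python sketch that evaluates each term exactly with rational arithmetic. The helper name `from_binary` is made up for this illustration; it is not part of any library.

```python
from fractions import Fraction

def from_binary(digits: str, exponent: int) -> Fraction:
    """Exact value of a signed binary string such as '-0.111' times 2**exponent.

    Hypothetical helper for illustration only.
    """
    sign = -1 if digits.startswith("-") else 1
    whole, _, frac = digits.lstrip("+-").partition(".")
    # Read all digits as one integer, then move the binary point back into place.
    mantissa = Fraction(int(whole + frac, 2), 2 ** len(frac))
    return sign * mantissa * Fraction(2) ** exponent

a = from_binary("1.000", -1)     # 1/2
b = from_binary("-0.111", -1)    # -7/16
total = a + b

print(a, b, total)                          # 1/2 -7/16 1/16
print(total == from_binary("0.001", -1))    # True
```

Using `Fraction` rather than `float` keeps every intermediate value exact, so the comparison at the end is not affected by rounding.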

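Things are a lot more complicated with floating point numbers (exponents have to be aligned, and the result normalized and rounded), so let's start with integers. Both operands here share the exponent 2^-1, so the significands 1.000 and 0.111 can be treated as the integers 1000 and 0111.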

To turn a positive number into a negative, you invert all the bits and add one. This is called "two's complement" arithmetic. If we use 8-bit numbers for our example, 0111 is 00000111; inverting the bits gives 11111000, and adding one gives 11111001, which represents -0111.

Now when you add up the numbers, 00001000 + 11111001 = 100000001. The carry out of the uppermost bit gets thrown away, leaving you with 00000001, i.e. 1000 - 0111 = 0001.
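
The same 8-bit steps can be sketched in Python. Python integers are unbounded, so the 8-bit width is simulated here with a mask; `BITS`, `MASK`, `negate`, and `add` are names chosen for this example only.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0b11111111, keeps only the low 8 bits

def negate(x: int) -> int:
    """Two's complement negation: invert all bits, add one, keep 8 bits."""
    return ((x ^ MASK) + 1) & MASK

def add(x: int, y: int) -> int:
    """Add two 8-bit patterns and discard the carry out of the top bit."""
    return (x + y) & MASK

minus_seven = negate(0b0111)
result = add(0b1000, minus_seven)

print(format(minus_seven, "08b"))  # 11111001
print(format(result, "08b"))       # 00000001
```

The `& MASK` in `add` is what "throwing away the overflow" amounts to: the ninth bit of 100000001 is simply not kept.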
