How are mathematical operations stored in variables in C?

Question

I've been using Computer Systems: A Programmer's Perspective to learn more about computer architecture. Today, I was studying unsigned and signed addition and the handling of overflows in both cases.

I didn't have a problem with unsigned addition, and signed addition followed naturally because it is basically unsigned addition with conversions from signed to unsigned and back. The book states this as:

The w-bit two's-complement sum of two numbers has the exact same bit-level representation as the unsigned sum. In fact, most computers use the same machine instruction to perform either unsigned or signed addition.

Also with the formula, where w represents the word size:

$x +_{w}^{t} y = U2T_{w}(T2U_{w}(x) +_{w}^{u} T2U_{w}(y))$
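As a rough illustration of that formula (my own sketch, not code from the book), here it is for w = 32. The name tadd32 is made up for this example, and the U2T step is written out by hand so that no out-of-range conversion is needed:

```c
#include <stdint.h>

/* Sketch of x +t y = U2T(T2U(x) +u T2U(y)) for w = 32.
   tadd32 is a hypothetical name used only for this illustration. */
int32_t tadd32(int32_t x, int32_t y) {
    uint32_t ux = (uint32_t)x;   /* T2U: conversion to unsigned is well defined */
    uint32_t uy = (uint32_t)y;
    uint32_t usum = ux + uy;     /* unsigned addition wraps modulo 2^32 */

    /* U2T: values up to INT32_MAX map to themselves,
       larger values map to usum - 2^32. */
    if (usum <= (uint32_t)INT32_MAX)
        return (int32_t)usum;
    return (int32_t)(usum - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}
```

With this sketch, tadd32(INT32_MAX, 1) gives INT32_MIN, which is the wrap-around the book describes for positive overflow.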

After this I moved on to the exercises, and one of them asks me to analyze this buggy code:

int tadd_ok(int x, int y) {
    int sum = x+y;
    return (sum-x == y) && (sum-y == x);
}

In the answer, it says the operation sum-x in sum-x == y forms an abelian group, so it turns into x+y-x and y is left no matter what the value of sum is, resulting in an incorrect evaluation.

So, here's the thing I wonder: I always thought, intuitively, that the value is assigned to the variable after an operation, not the operation itself, because if this weren't the case, the variable sum would just hold the value of x+y, not the operation. Then it would result in something like some_value-x in the latter operation, not forming an abelian group, so it would behave as expected. Instead, it carried the operation itself into sum-x == y.

Is this happening just because the value x+y cannot be stored in an int because of the size, so it just carries the operation, or does it always work like this, even when there's no potential overflow? If so, is it always the case for every other programming language?

Answer 1

For this:

int tadd_ok(int x, int y) {
    int sum = x+y;
    return (sum-x == y) && (sum-y == x);
}

What you'd expect is that for some values of x and y the addition will cause an overflow. The C standard (before C23) does not mandate a particular representation for signed integers (e.g. an implementation might use "two's complement" or might not), and signed integer overflow is undefined behavior (it might wrap around, it might format your hard drive). Because overflow is undefined behavior, the compiler can do anything it likes, including assuming that it never happens (which is something that some compilers are known to do).

If the compiler does assume that the undefined behavior never happens (that x+y never overflows), then the compiler can also assume that (sum-x == y) && (sum-y == x) is always true, and the compiler can optimize the code into:

int tadd_ok(int x, int y) {
    return 1;
}
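Because the original check relies on undefined behavior, a version that tests the operands before adding avoids the problem. This is only a sketch under that idea; the name tadd_ok_checked is made up here, and it is not necessarily the solution the book gives:

```c
#include <limits.h>

/* Returns 1 if x + y can be computed without overflow, 0 otherwise.
   All comparisons happen before any addition, so no signed overflow
   (and therefore no undefined behavior) can occur.
   tadd_ok_checked is a hypothetical name for this illustration. */
int tadd_ok_checked(int x, int y) {
    if (y > 0 && x > INT_MAX - y)
        return 0;   /* x + y would exceed INT_MAX */
    if (y < 0 && x < INT_MIN - y)
        return 0;   /* x + y would go below INT_MIN */
    return 1;
}
```

Since nothing here overflows, the compiler has no undefined behavior to reason from and cannot fold the function into a constant.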

In the answer, it says the operation sum-x in sum-x == y forms an abelian group, so it turns into x+y-x and y is left no matter what the value of sum is, resulting in an incorrect evaluation.

That is "potentially possible in practice" (if the implementation uses "two's complement" representation for signed integers, and if "undefined behavior" becomes wrapping/truncation), but it's not guaranteed, even if the underlying CPU uses "two's complement" and wrapping/truncation.

So, here's the thing I wonder: I always thought, intuitively, that the value will be assigned after an operation to the variable, not the operation itself because if this weren't the case, the variable sum would just have the value of x+y, not the operation. Then it would result in something like some_value-x in the latter operation not causing an abelian group to form, so it would behave as expected. Instead, it carried the operation itself to the sum-x == y

For the "potentially possible in practice (but not guaranteed)" case, the value is assigned. For example, with a 16-bit int and a call like tadd_ok(30000, 30000), sum = x + y; would compute 30000 + 30000 = +60000, which is too big to fit and becomes -5536 due to wrapping. Then sum - x = -5536 - 30000 = -35536, which is also out of range and becomes +30000 due to wrapping, which equals y.
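The wrapping arithmetic above can be reproduced without relying on undefined behavior by doing the math in a wider type and wrapping by hand. This is a sketch that models a 16-bit two's-complement int; wrap16 and tadd_ok_wrap16 are made-up names:

```c
/* Reduce a wide value to the range of a 16-bit two's-complement int,
   [-32768, 32767], by wrapping modulo 65536.
   wrap16 is a hypothetical helper for this illustration. */
long wrap16(long v) {
    long r = v % 65536L;            /* r lies strictly between -65536 and 65536 */
    if (r > 32767L)  r -= 65536L;   /* too big: wrap down */
    if (r < -32768L) r += 65536L;   /* too small: wrap up */
    return r;
}

/* tadd_ok from the question, with every intermediate result wrapped
   the way a wrapping 16-bit machine would wrap it. */
int tadd_ok_wrap16(long x, long y) {
    long sum = wrap16(x + y);
    return wrap16(sum - x) == y && wrap16(sum - y) == x;
}
```

Here wrap16(30000L + 30000L) is -5536 and wrap16(-5536L - 30000L) is 30000, so tadd_ok_wrap16(30000, 30000) returns 1 even though the true sum does not fit in 16 bits.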

Is this happening just because the value x+y cannot be stored in an int because of the size, so it just carries the operation, or does it always work like this, even when there's no potential overflow?

If it happens like this, then it happens because the first overflow (in sum = x + y;) causes wrapping, followed by a second overflow (in sum - x) and a third overflow (in sum - y) which cause more wrapping that cancels out the wrapping caused by the first overflow.

If so, is it always the case for every other programming language?

It's not always the case for C. For example, with a 16-bit int using "sign and magnitude" representation for signed integers and a call like tadd_ok(30000, 30000), sum = x + y; could compute 30000 + 30000 = +60000, which is too big to fit and becomes +27232 due to wrapping of the magnitude part only. Then sum - x = 27232 - 30000 = -2768, which is not equal to y. Of course, in this case (because it's undefined behaviour) the compiler could still always return true (as described at the start), and the same code might behave one way with the optimizer disabled and another way with the optimizer enabled.
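The sign-and-magnitude case can be modelled the same way. This sketch assumes, as in the example above, that an out-of-range result keeps its sign and only the 15-bit magnitude wraps; real sign-and-magnitude hardware may behave differently, and wrap_sm16 and tadd_ok_sm16 are made-up names:

```c
/* Model of a 16-bit sign-and-magnitude int in which an out-of-range
   result keeps its sign and only the 15-bit magnitude wraps
   (modulo 32768). This is an assumed model, not a real ABI. */
long wrap_sm16(long v) {
    long mag = v < 0 ? -v : v;   /* magnitude of the exact result */
    mag %= 32768L;               /* keep the low 15 bits of the magnitude */
    return v < 0 ? -mag : mag;   /* reattach the sign */
}

/* tadd_ok from the question, evaluated under that model. */
int tadd_ok_sm16(long x, long y) {
    long sum = wrap_sm16(x + y);
    return wrap_sm16(sum - x) == y && wrap_sm16(sum - y) == x;
}
```

Under this model wrap_sm16(60000L) is 27232 and 27232 - 30000 stays -2768, so tadd_ok_sm16(30000, 30000) returns 0: the later subtractions do not cancel the first wrap.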

In general, "two's complement" is much more common and (for performance reasons) most languages don't do anything about overflows, so (without optimization) the "potentially possible in practice (but not guaranteed)" behavior isn't uncommon in other languages.

solution1
3 ACCPTED 2020-12-01 03:16:53
