
Is it safe to take the difference of two size_t objects?

I'm investigating a standard for my team around using size_t vs int (or long, etc.). The biggest drawback I've seen pointed out is that taking the difference of two size_t objects can cause problems (I'm unsure of the specific problems -- maybe something isn't two's complement and the signed/unsigned mix angers the compiler). I wrote a quick program in C++ using the V120 VS2013 compiler that allowed me to do the following:

#include <cstddef>
#include <iostream>

int main()
{
    std::size_t a = 10;
    std::size_t b = 100;
    int result = a - b;
    std::cout << result << std::endl;  // prints -90 here
}

The program resulted in -90, which, although correct, makes me nervous about type mismatches, signed/unsigned problems, or just plain undefined behavior if the size_t values happen to get used in more complex math.

My question is whether it's safe to do math with size_t objects -- specifically, taking the difference. I'm considering using size_t as a standard for things like indexes. I've seen some interesting posts on the topic here, but they don't address the math issue (or I missed it):

What type for subtracting 2 size_t's?

typedef for a signed type that can contain a size_t?

This is not guaranteed to work portably, but it is not UB either. The code must run without error, but the resulting int value is implementation-defined. So as long as you are working on platforms that guarantee the desired behavior, this is fine (provided the difference can be represented by an int, of course); otherwise, just use signed types everywhere (see the last paragraph).

Subtracting two std::size_t values will yield a new std::size_t, and its value will be determined by wrapping. In your example, assuming a 64-bit size_t, a - b will equal 18446744073709551526. This does not fit into a (commonly 32-bit) int, so an implementation-defined value is assigned to result.
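A minimal sketch of the two steps described here (the wrapped value assumes a 64-bit size_t, and the final conversion to int is implementation-defined before C++20, though modular on typical two's complement platforms):

#include <cstddef>
#include <iostream>

int main()
{
    std::size_t a = 10;
    std::size_t b = 100;

    // Unsigned subtraction wraps modulo SIZE_MAX + 1; with a 64-bit
    // size_t this prints 18446744073709551526.
    std::size_t wrapped = a - b;
    std::cout << wrapped << '\n';

    // Converting that out-of-range value to int is implementation-defined
    // (modular, hence -90, on typical two's complement platforms).
    int result = static_cast<int>(a - b);
    std::cout << result << '\n';
}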

To be honest, I would recommend not using unsigned integers for anything but bit magic. Several members of the standard committee agree with me: https://channel9.msdn.com/Events/GoingNative/2013/Interactive-Panel-Ask-Us-Anything (9:50, 42:40, 1:02:50).

Rule of thumb (paraphrasing Chandler Carruth from the above video): if you could count it yourself, use int; otherwise use std::int64_t.
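Not from the linked panel, just an illustrative sketch of the kind of trap that motivates this advice: a descending loop over an unsigned counter never terminates, while a signed counter behaves as expected.

#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v{1, 2, 3};

    // With an unsigned counter, i >= 0 is always true: when i reaches 0,
    // --i wraps around to SIZE_MAX and the loop never terminates (and
    // then indexes out of bounds).
    // for (std::size_t i = v.size() - 1; i >= 0; --i) { ... }

    // A signed counter works as expected.
    for (std::int64_t i = static_cast<std::int64_t>(v.size()) - 1; i >= 0; --i)
        std::cout << v[static_cast<std::size_t>(i)] << '\n';
}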


Unless its conversion rank is less than that of int, e.g. if std::size_t were unsigned short. In that case both operands are promoted to int, the result is an int, and everything will work fine (unless int is no wider than short); see the sketch after the list below. However:

  1. I do not know of any platform that does this.
  2. This would still be platform-specific; see the first paragraph.
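No mainstream platform has a size_t narrower than int, but the promotion rule itself can be demonstrated with unsigned short as a stand-in (a sketch, assuming the usual 16-bit short and 32-bit int):

#include <iostream>
#include <type_traits>

int main()
{
    // unsigned short stands in for a hypothetical size_t narrower than int.
    unsigned short a = 10;
    unsigned short b = 100;

    // Both operands are promoted to int (assuming int can represent every
    // unsigned short value), so the subtraction happens in signed
    // arithmetic and yields -90 directly, with no wrapping.
    auto diff = a - b;
    static_assert(std::is_same_v<decltype(diff), int>, "promoted to int");
    std::cout << diff << '\n';   // prints -90
}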

If you don't use size_t, you are screwed: size_t is the one type that exists to be used for memory sizes, and it is consequently guaranteed to always be big enough for that purpose. (uintptr_t is quite similar, but it's neither the first such type, nor is it used by the standard libraries, nor is it available without including stdint.h.) If you use an int, you can get undefined behavior when your allocations exceed 2 GiB of address space (or 32 KiB if you are on a platform where int is only 16 bits!), even though the machine has more memory and you are executing in 64-bit mode.

If you need a difference of size_t that may become negative, use the signed variant ssize_t .
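Note that ssize_t comes from POSIX rather than standard C++. A minimal sketch of a portable alternative, assuming the operand values themselves fit in std::int64_t:

#include <cstddef>
#include <cstdint>
#include <iostream>

// Convert each operand to a signed 64-bit type before subtracting, so the
// difference is computed in signed arithmetic. (On POSIX systems ssize_t
// could be used instead; it is not part of standard C++.)
std::int64_t signed_diff(std::size_t a, std::size_t b)
{
    return static_cast<std::int64_t>(a) - static_cast<std::int64_t>(b);
}

int main()
{
    std::cout << signed_diff(10, 100) << '\n';   // prints -90
}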

The size_t type is unsigned. The subtraction of any two size_t values has defined behavior.

However, firstly, the numeric result depends on the implementation when a larger value is subtracted from a smaller one. The result is the mathematical value, reduced to the smallest non-negative residue modulo SIZE_MAX + 1. For instance, if the largest value of size_t is 65535 and the mathematical result of subtracting two size_t values is -3, then the actual result will be 65536 - 3 = 65533. On a different compiler or machine with a different size_t, the numeric value will be different.
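The 16-bit case can be reproduced with std::uint16_t standing in for such a size_t (a sketch; the explicit cast emulates 16-bit arithmetic, since the operands themselves are promoted to int on common platforms):

#include <cstdint>
#include <iostream>

int main()
{
    // std::uint16_t stands in for a size_t whose largest value is 65535.
    std::uint16_t a = 1;
    std::uint16_t b = 4;

    // The mathematical result 1 - 4 = -3, reduced modulo 65536, is 65533.
    std::uint16_t wrapped = static_cast<std::uint16_t>(a - b);
    std::cout << wrapped << '\n';   // prints 65533
}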

Secondly, a size_t value might be out of range of the type int. If that is the case, we get a second implementation-defined result arising from the forced conversion. In this situation, any behavior can apply; it just has to be documented by the implementation, and the conversion must not fail. For instance, the result could be clamped into the int range, producing INT_MAX. A common behavior seen on two's complement machines (virtually all) in the conversion of wider (or equal-width) unsigned types to narrower signed types is simple bit truncation: enough bits are taken from the unsigned value to fill the signed value, including its sign bit.

Because of the way two's complement works, if the original arithmetically correct abstract result itself fits into int, then the conversion will produce that result.

For instance, suppose that the subtraction of a pair of 64-bit size_t values on a two's complement machine yields the abstract arithmetic value -3, which becomes the positive value 0xFFFFFFFFFFFFFFFD. When this is coerced into a 32-bit int, the common behavior seen in many compilers for two's complement machines is that the lower 32 bits are taken as the image of the resulting int: 0xFFFFFFFD. And, of course, that is just the value -3 in 32 bits.
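A sketch of that truncation (the conversion is implementation-defined before C++20 and specified as modular from C++20 on, but mainstream two's complement compilers behave as shown):

#include <cstdint>
#include <iostream>

int main()
{
    // The abstract result -3 appears as 0xFFFFFFFFFFFFFFFD in a 64-bit
    // unsigned type.
    std::uint64_t wide = 0xFFFFFFFFFFFFFFFDULL;

    // Converting to a 32-bit int keeps the low 32 bits, 0xFFFFFFFD, which
    // reads back as -3 in two's complement.
    std::int32_t narrow = static_cast<std::int32_t>(wide);
    std::cout << narrow << '\n';   // prints -3 on mainstream compilers
}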

So the upshot is that your code is de facto quite portable, because virtually all mainstream machines are two's complement with conversion rules based on sign extension and bit truncation, including between signed and unsigned.

Except that sign extension doesn't occur when a value is widened while converting from unsigned to signed. Thus the one problem is the rare situation in which int is wider than size_t. If a 16-bit size_t result is 65533, due to 4 being subtracted from 1, this will not produce -3 when converted to a 32-bit int; it will produce 65533!
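A sketch of that widening case, again with std::uint16_t standing in for a 16-bit size_t:

#include <cstdint>
#include <iostream>

int main()
{
    // A 16-bit "size_t" result of 1 - 4, already wrapped to 65533.
    std::uint16_t wrapped = 65533;

    // Widening an unsigned value to a larger signed type preserves the
    // value: 65533 fits in a 32-bit int, so no sign extension happens
    // and the result stays 65533 rather than becoming -3.
    std::int32_t widened = wrapped;
    std::cout << widened << '\n';   // prints 65533
}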
