size_t, ptrdiff_t and std::vector::size()

I thought that the correct type to use to store the difference between pointers was ptrdiff_t.

As such, I'm confused by the way that my STL (msvc 2010) implements its std::vector::size() function. The return type is size_t (this is mandated by the standard, as far as I understand it) and yet it's computed as the difference of pointers:

// _Mylast, _Myfirst are of type pointer
// size_type, pointer are inherited from allocator<_Ty>
size_type size() const 
{
    return (this->_Mylast - this->_Myfirst);
}

Obviously, there's a bit of meta-magic that goes on in order to determine exactly what types size_type and pointer are. In order to be "sure" what types they are I checked this:

// requires <type_traits> and <vector>
bool bs = std::is_same<size_t, std::vector<int>::size_type>::value;
bool bp = std::is_same<int * , std::vector<int>::pointer>::value;
// both bs and bp evaluate as true, therefore:
//   size_type is just size_t
//   pointer is just int*

Compiling the following with /Wall gives me a signed-to-unsigned mismatch for mysize2, but no warnings for mysize1:

std::vector<int> myvector(100);
int *tail = &myvector[99];
int *head = &myvector[ 0];
size_t mysize1 = myvector.size();
size_t mysize2 = (tail - head + 1);

Changing the type of mysize2 to ptrdiff_t results in no warning. Changing the type of mysize1 to ptrdiff_t results in an unsigned-to-signed mismatch.
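The two warnings line up with the static types of the two expressions. A minimal compile-time check, assuming a C++11 compiler and the default allocator (the constant names are hypothetical, chosen for illustration):

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Subtracting two int* values yields the signed type std::ptrdiff_t.
constexpr bool kPointerDiffIsPtrdiff =
    std::is_same<decltype(std::declval<int*>() - std::declval<int*>()),
                 std::ptrdiff_t>::value;

// size() on a std::vector<int> with the default allocator yields std::size_t.
constexpr bool kSizeReturnsSizeT =
    std::is_same<decltype(std::declval<const std::vector<int>&>().size()),
                 std::size_t>::value;

static_assert(kPointerDiffIsPtrdiff, "pointer difference should be ptrdiff_t");
static_assert(kSizeReturnsSizeT, "vector<int>::size() should return size_t");
```

So (tail - head + 1) is a signed value being stored in an unsigned variable, while myvector.size() is already unsigned.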

Obviously I'm missing something...

EDIT: I'm not asking how to suppress the warning, with a cast or a #pragma warning(disable: xxx). The issue I'm concerned about is that size_t and ptrdiff_t may have different allowable ranges (they do on my machine).

Consider std::vector<char>::max_size(). My implementation returns a max_size equal to std::numeric_limits<size_t>::max(). Since vector::size() is creating an intermediate value of type ptrdiff_t before converting it to size_t, it seems that there could be problems here: ptrdiff_t is not big enough to hold vector<char>::max_size().
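The range difference can be checked directly. A small sketch, assuming C++11 and a platform where PTRDIFF_MAX does not exceed SIZE_MAX (true on mainstream platforms); the names kSizeMax, kDiffMax and fits_in_ptrdiff are hypothetical:

```cpp
#include <cstddef>
#include <limits>

// Largest values representable in each type on the current platform.
constexpr std::size_t    kSizeMax = std::numeric_limits<std::size_t>::max();
constexpr std::ptrdiff_t kDiffMax = std::numeric_limits<std::ptrdiff_t>::max();

// Hypothetical helper: true if an element count can be represented as a
// ptrdiff_t. Assumes PTRDIFF_MAX <= SIZE_MAX, so casting kDiffMax is lossless.
inline bool fits_in_ptrdiff(std::size_t count) {
    return count <= static_cast<std::size_t>(kDiffMax);
}
```

On a platform where both types have the same width, fits_in_ptrdiff(kSizeMax) is false, which is exactly the gap described above.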

Generally speaking, ptrdiff_t is a signed integral type of the same size as size_t. It must be signed so that it can represent both p1 - p2 and p2 - p1.
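A short illustration of why the type has to be signed, assuming both pointers refer to elements of the same array (the helper name signed_distance is hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: signed element distance from `from` to `to`.
// Both pointers must point into (or one past the end of) the same array.
inline std::ptrdiff_t signed_distance(const int* from, const int* to) {
    assert(from != nullptr && to != nullptr);
    return to - from;  // positive if `to` is later, negative if earlier
}
```

Swapping the arguments flips the sign of the result, which an unsigned type could not express.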

In the specific case of the internals of std::vector, the implementor is effectively deriving size() from end() - begin(). Because of the guarantees of std::vector (contiguous, array-based storage), the end pointer never compares less than the begin pointer (the two are equal for an empty vector), and thus there is no risk of generating a negative value. In fact, when the two types have the same width, size_t can represent a larger positive range than ptrdiff_t, as it doesn't have to use half its range to represent negative values. Effectively, this means that the conversion from ptrdiff_t to size_t is value-preserving in this case: every non-negative ptrdiff_t value is representable as a size_t, so the result is well defined (and intuitively obvious).
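The conversion behaviour can be sketched as follows, assuming size_t is at least as wide as ptrdiff_t (the helper name to_size is hypothetical). Non-negative values come through unchanged; a negative value would wrap modulo 2^N, which is why the "end is never before begin" guarantee matters:

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: the same signed-to-unsigned conversion that happens
// when a pointer difference is returned as a size_t.
inline std::size_t to_size(std::ptrdiff_t diff) {
    return static_cast<std::size_t>(diff);
}
```

For example, to_size(100) is 100, whereas to_size(-1) wraps around to the largest size_t value.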

Also, note that this is not the only possible implementation of std::vector. It could just as easily be implemented as a single pointer and a size_t value holding the size, deriving end() as begin() + size(). That implementation would also resolve your max_size() concern. In reality, max_size is never actually attainable: it would require your program's entire address space to be allocated for the vector's buffer, leaving no room for the begin()/end() pointers, function call stack, etc.
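A rough sketch of that alternative layout, purely for illustration: PtrAndSize is a hypothetical, non-owning class, not how any particular standard library implements std::vector. The point is only that size() returns a stored size_t and never goes through a pointer subtraction:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, non-owning sketch: a first-element pointer plus a stored count.
template <typename T>
class PtrAndSize {
public:
    PtrAndSize(T* first, std::size_t count) : first_(first), count_(count) {
        assert(first != nullptr || count == 0);
    }

    T* begin() const { return first_; }
    T* end() const { return first_ + count_; }    // derived: begin() + size()
    std::size_t size() const { return count_; }   // no ptrdiff_t intermediate

private:
    T* first_;
    std::size_t count_;
};
```

With this layout the size is stored directly, at the cost of computing end() on demand.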

There is nothing wrong with how std::vector::size() is implemented in the STL. That this->_Mylast - this->_Myfirst equals the vector's size is merely an implementation detail that depends on how the vector is implemented.

In addition, the MSVC STL vector implementation has a #pragma warning(disable: 4244), which suppresses the warning.
