How much work is done by calling vector.size()?

Is it a simple getter, or is the size calculated every time?

If I have a for loop like this:

for (int i = 0; i < myVector.size(); i++) { }

where I only need to use the size once, would it be better to calculate this once before the loop and store it in a variable? Or is size() just a simple getter and it would make little difference?

That is implementation defined. Assuming that implementations want to provide efficient standard libraries, the two most probable implementations are:

  • subtracting the pointer pointing to the beginning of the storage from the pointer pointing to one after the end of the storage, and returning the result of that computation.
  • returning a size variable that is kept in sync for every operation in the vector.

In any case the standard requires the complexity to be constant, so you should not really worry about it. Compilers also often optimize well enough that caching the size yourself makes no difference.
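The two candidate implementations listed above can be sketched roughly as follows. This is a minimal illustration, not the code of any real standard library: the struct and member names are hypothetical, and the element type is fixed to int for brevity.

```cpp
#include <cstddef>

// Hypothetical layout A: three pointers, size computed on demand.
struct PointerPairLayout {
    int* first = nullptr;           // start of the storage
    int* last = nullptr;            // one past the last element in use
    int* end_of_storage = nullptr;  // one past the allocated capacity

    std::size_t size() const {
        // One pointer subtraction per call.
        return static_cast<std::size_t>(last - first);
    }
};

// Hypothetical layout B: the element count is stored and kept in sync
// by every operation that adds or removes elements.
struct StoredSizeLayout {
    int* data = nullptr;
    std::size_t count = 0;
    std::size_t capacity = 0;

    std::size_t size() const {
        // A plain member read.
        return count;
    }
};
```

Either way, size() does a fixed amount of work regardless of how many elements are stored.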

Use iterators instead of such a loop for more flexibility and genericity:

for (std::vector<int>::const_iterator iter = myVector.begin(); iter != myVector.end(); ++iter)
{
    // use *iter here
}

This ensures that if you change your container at a later stage of development, your code is less tightly coupled to it.
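As a rough illustration of that decoupling, the sketch below uses a hypothetical helper, sum_elements (not from the original answer), which iterates with iterators only, so the same code works whether the caller passes a std::vector or a std::list.

```cpp
#include <list>
#include <vector>

// Hypothetical helper: sums any container of int that provides
// const_iterator, begin() and end(). No indexing or size() is used,
// so the container type can change without touching this loop.
template <typename Container>
long sum_elements(const Container& c) {
    long total = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
        total += *it;
    }
    return total;
}
```

Calling sum_elements on a std::vector<int> or on a std::list<int> compiles without changes to the loop, which an index-based loop over a list would not.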

In every implementation I've looked at, it's calculated every time - in O(1) - by subtracting begin from end. So, it makes very little difference. The compiler may or may not optimise away the "calculation", but the time is so minuscule as to be irrelevant for all but the most desperate situations.

Use whatever's more maintainable in your opinion: whatever reads best, localises the logic best, etc. It's arguably noteworthy that in using size() you ensure you accurately track the end of the vector even if there are deletions or insertions made inside the loop body, so there can be functional reasons to prefer (or avoid) calling size() repeatedly.
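To make the functional difference concrete, here is a small sketch with a hypothetical function, append_doubles_of_odds, whose loop body grows the vector. Because size() is re-read on every iteration, the loop also visits the appended elements; a size cached before the loop would stop at the original end.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: for every odd value encountered, append its double.
// Indexing is used rather than iterators because push_back may reallocate
// and invalidate iterators, whereas an index stays meaningful.
std::vector<int> append_doubles_of_odds(std::vector<int> v) {
    for (std::size_t i = 0; i < v.size(); ++i) {  // size() re-read each time
        if (v[i] % 2 != 0) {
            v.push_back(v[i] * 2);  // grows the vector mid-loop
        }
    }
    return v;
}
```

The loop terminates because every appended value is even and therefore triggers no further appends.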

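Note that for std::list, size() was not guaranteed to be constant time before C++11, and some implementations (libstdc++, for example) made it linear. Since C++11 the standard requires constant complexity there too.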

http://www.cplusplus.com/reference/stl/vector/size/ states that the complexity of the operation is constant. So either the size is not recalculated on each call to size(), or the recalculation is cheap enough that the overhead is negligible in most cases. Using it the way your code does is fine: it won't be the bottleneck of your code. (The exception is a tight inner loop, where every small optimization can matter.)

Moreover, you'll be on a safer (but not completely safe!) side if the vector is going to be changed inside the loop.

If you look at the implementation of vector::size()

size_type size() const
{   // return length of sequence
        return (_Mylast - _Myfirst);
}

where _Mylast and _Myfirst are pointers.

It's just pointer arithmetic: a single cheap subtraction (though not "atomic" in the thread-safety sense). Hence the overhead should be negligible!

Perhaps you are actually wondering whether the compiler can figure out that the value returned by the size() function never changes and hoist it out of the loop. While this seems sensible, the compiler would have to be able to prove that the hoisting produces the exact same behaviour as the version you have written (i.e. "as if" size() was called every time), and there are limits to how far the compiler can see, or indeed to how far this is actually provable.

Therefore, if you know that you aren't changing the size of the container, you should write the hoist manually:

// index version:
for (std::size_t i = 0, end = v.size(); i != end; ++i) { work_with(v[i]); }

// iterator version:
for (auto it = v.begin(), end = v.end(); it != end; ++it) { work_with(*it); }

Calls to std::vector::size() have constant complexity, i.e. their cost is independent of the number of elements in the vector, but there may be a cost nonetheless. In the best circumstances you are simply reading a variable (so there's no difference between the hoisted and non-hoisted versions of the loop, bar one copy), but it's also feasible that the implementation performs a pointer subtraction.

Computing the size() for a std::vector takes constant time but the computation may be more involved than the difference between two pointers. Although this seems like a cheap computation it adds constraints on the available registers. Also, other implementations of std::vector take even more effort in size() but make other operations more efficient.

For example, for many use cases it is better to represent a std::vector with just one pointer to the start of the range and the control data prepended to this range. Oddly enough, this happens to be how memory allocated using an array allocation is often laid out anyway. The control data would contain two pointers (one to the end of the range, the other to the end of the capacity). Getting the size in this case consists of an address adjustment (with fixed size, though), a pointer dereference, and a subtraction. Still not a lot, but the less you do in a [tight] loop, the faster it runs (in most cases).
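A minimal sketch of such a single-pointer layout might look like the following. Everything here is hypothetical (the class name CompactIntVector, the Header struct, the fixed capacity with no growth, the int-only element type); it only illustrates why size() then costs an address adjustment, a dereference and a subtraction, and it is not how any particular standard library implements std::vector.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical single-pointer vector of int: the object stores only data_,
// and the control data lives immediately before the elements.
class CompactIntVector {
    struct Header {
        int* last;            // one past the last element in use
        int* end_of_storage;  // one past the allocated capacity
    };

    int* data_;  // points at the first element, just after the header

    Header* header() const {
        // Fixed-size address adjustment back to the prepended header.
        return reinterpret_cast<Header*>(data_) - 1;
    }

public:
    explicit CompactIntVector(std::size_t capacity) {
        void* raw = std::malloc(sizeof(Header) + capacity * sizeof(int));
        if (raw == nullptr) throw std::bad_alloc();
        Header* h = static_cast<Header*>(raw);
        data_ = reinterpret_cast<int*>(h + 1);
        h->last = data_;
        h->end_of_storage = data_ + capacity;
    }
    ~CompactIntVector() { std::free(header()); }
    CompactIntVector(const CompactIntVector&) = delete;
    CompactIntVector& operator=(const CompactIntVector&) = delete;

    // Returns false when the fixed capacity is exhausted (this sketch
    // never reallocates).
    bool push_back(int value) {
        Header* h = header();
        if (h->last == h->end_of_storage) return false;
        *h->last++ = value;
        return true;
    }

    std::size_t size() const {
        // Address adjustment, one dereference, one subtraction.
        return static_cast<std::size_t>(header()->last - data_);
    }
};
```

The object itself is the size of one pointer, which is the trade-off described above: a slightly more involved size() in exchange for a smaller handle.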

The upshot is: get the size() and the end() just once unless your container is changing its size. If it is changing its size, it depends on the kind of container whether you should recompute things: e.g. for a std::list you don't need to get the end() again (and you don't want to get its size() at all unless you absolutely have to). I seem to recall that Scott Meyers has a discussion of this topic in "Effective STL" but I don't recall whether he says what I just did (or if he is just wrong ;).
