简体   繁体   中英

Unsigned int or iterator when using vector?

I am currently working on a c++ school project with some friends.

Before when i had vectors in c++ i did something like this to use them :

unsigned int i = 0;
while (i != myVector.size())
{
    doSomething(myVector[i]);
    i++;
}

But during this project my friends were not happy seeing me using vectors like this and asked me to use iterators. I don't really enjoy iterators because their syntax is quite hard to remember but my friends said that is was better using them because it works faster. And since we are working in a big project with a lot of vectors it was crucial using iterators.

Time has passed and i'm still using them even if i still can't remember their syntax but i wanted to see if the iterator method was really faster them the "unsigned int" method.

So i made this 2 programs:

First program using the unsigned int method:

#include <vector>
#include <string>
#include <iostream>

int main()
{
    std::string str = "This is a string";

    int i = 0;
    std::vector<std::string> vec;

    while (i != 10000000)
    {
        vec.push_back(str);
        i++;
    }

    unsigned int j = 0;
    while (j != vec.size())
    {
        std::cout << vec[j] << std::endl;
        j++;
    }
    return (0);
}

And the second program using the iterator method:

#include <vector>
#include <string>
#include <iostream>

int main()
{
    std::string str = "This is a string";

    int i = 0;
    std::vector<std::string> vec;

    while (i != 10000000)
    {
        vec.push_back(str);
        i++;
    }

    std::vector<std::string>::iterator it;
    it = vec.begin();
    while (it != vec.end())
    {
        std::cout << *it << std::endl;
        it++;
    }
    return (0);
}

As you can see both programs will first create a vector with a size of 10 000 000 (i've put a big size so if there's a difference in time it will be easier to notice) and then i will just print the string in the vector but using the two different methods.

I used time on linux to know the execution time of each program like this:

time ./a.out

And here's the result:

The unsigned int method:

real    0m39,391s
user    0m5,463s
sys     0m21,108s

The iterator method:

real    0m39,436s
user    0m5,972s
sys     0m20,652s

And ......... it's the same time ?! There's just a negligible less than 1 second difference between both and it's a vector with 10 million strings in it.

So i was wondering is there really a difference between this two methods and are iterators really better to use ?

The main reason to use iterators is not performance, but less possibilities for mistakes and more expressive code. Compare this

unsigned int i = 0;
while (i != myVector.size())
{
    doSomething(myVector[i]);
    i += 2;
}

or

unsigned int start = myVector.size() + 42;

for (unsigned int i = start; i != myVector.size(); ++i){
    doSomething(myVector[i]);
}

with

for (const auto& e : myVector) {
     doSomething(e);
}

Range based for loops makes using iterators as simple as it can get (you dont even see the iterators, but they are used behind the scenes). When you manually manage the index there are millions of ways to get it wrong, with iterators there are maybe 2 or 3.

For your performance comparison: As vectors store their element in contiguous memory, vector iterators can be plain pointers. What you think is overhead is mainly syntactic sugar to enable you to write nicer code. Hence its not a big surprise that you dont see much difference.

PS

i used it a lot i am kinda confident of not making too much mistakes

Using an integer to iterate an array is from the last century. Its unsafe, it leads to hard to detect bugs and can easily invoke undefined behaviour. Write code to express what you want to do, not to instruct your processor. If you want to do something for each element of a vector you should a range based for loop or the older std::for_each :

std::for_each(myVector.begin(),myVector.end(),doSomething);

It has not any of the downsides of manually using an index (did you spot the mistakes in the above loops?) and has the advantage of looking the same no matter what container myVector actually is or what type of element it contains, or what doSomething actually is (it can be a free function, a functor, a lambda, your choice).

Surprisingly enough, there might be at least theoretical performance difference when comparing iterator access with index access in the loop, as can be shown here: https://gcc.godbolt.org/z/frFYhF

Filtering for some noise, with iterator every iteration looks like

.LBB0_1:                                # =>This Inner Loop Header: Depth=1
    movl    (%rbx), %edi
    callq   check(int)
    addq    $4, %rbx
    cmpq    %rbx, %r14
    jne     .LBB0_1

So here we see one memory access, one math operation, and one conditional branch. Overall, memory access will dwarf everything else once you are out of the cache line, but still those are operations taken.

When we look into index access, iteration looks like:

.LBB1_3:                                # =>This Inner Loop Header: Depth=1
    movq    (%r14), %rax
    movl    (%rax,%rbx,4), %edi
    callq   check(int)
    addq    $1, %rbx
    cmpq    %r15, %rbx
    jb      .LBB1_3

Here we see something we do not see in the previous example - an extra register move on every iteration (which is needed for displacement memory access).

Now, a register move is probably one of the cheapest real operations CPU can perform, but it still an operation, and it would be a reordering block, as later operation depends on it's result.

I believe, the performance impact we see here should not be something you think about when you are accessing the vector. Instead, you should strike for uniformity, ease of read and maintainability.

Having said all that, I suggest you opt for range-based loop.

for (int i: vec) {
     // work with i
}

Last, but not the list, using a variable of unsigned int to iterate over indexes of a vector could be a nasty bug. On many platforms vector could be larger than maxim int allows, and you will end up with the endless loop.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM