
C++, bad performance when instantiating a std::vector

I have a question about instantiating std::vector. I compared instantiating a std::vector with dynamically allocating an array of the same size. I expected the std::vector to take a little longer, but the performance difference is huge.

For the array I get 53 us. For the std::vector I get 4338 us.

My code:

#include <chrono>
#include <vector>
#include <iostream>

int main() {
    unsigned int NbItem = 1000000 ;
    std::chrono::time_point<std::chrono::system_clock> start, middle ,end;
    start = std::chrono::system_clock::now() ;
    float * aMallocArea = (float *)calloc(sizeof(float)*NbItem,0) ;
    middle = std::chrono::system_clock::now() ;
    std::vector<float> aNewArea ;
    middle = std::chrono::system_clock::now() ;
    aNewArea.resize(NbItem) ;
    //float * aMallocArea2 = new float[NbItem];
    end = std::chrono::system_clock::now() ;
    std::chrono::duration<double> elapsed_middle = middle-start;
    std::chrono::duration<double> elapsed_end = end-middle;
    std::cout << "ElapsedTime CPU  = " << elapsed_middle.count()*1000000 << " (us) " << std::endl ;
    std::cout << "ElapsedTime CPU  = " << elapsed_end.count()*1000000 << " (us) " << std::endl ;
    free(aMallocArea) ;
    return 0;
}

Even if I create a vector of size 0, I see this difference. Do you know why instantiating a std::vector performs so badly? Do you know how to improve it? (I tried the compiler option -O3, but it did not help much.)

Compilation line: g++ --std=c++11 -o test ./src/test.cpp

Compiler version:

g++ --version
g++ (Debian 4.7.2-5) 4.7.2
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Do you realize that this:

float * aMallocArea = (float *)calloc(sizeof(float)*NbItem, 0);

means "Allocate sizeof(float)*NbItem items, each of size zero"? calloc takes the element count first and the element size second, so this call performs an allocation of zero bytes.
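As a minimal sketch of the argument order, the helper below allocates the intended number of zeroed floats. The name alloc_zeroed_floats is hypothetical and not part of the question's code; the caller is assumed to release the result with free.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not from the question): returns `count` zeroed floats,
// or a null pointer if the allocation fails. The caller must std::free() it.
float* alloc_zeroed_floats(std::size_t count) {
    // calloc(number_of_elements, size_of_each_element)
    // The question's call passed (sizeof(float)*NbItem, 0), i.e. an element
    // size of zero, which requests zero bytes in total.
    return static_cast<float*>(std::calloc(count, sizeof(float)));
}
```

In the question's program this would correspond to calling calloc(NbItem, sizeof(float)). Reading the elements back as 0.0f assumes the usual IEEE 754 representation, where all-zero bits mean zero.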

Even once you correct this, the calloc form will be much faster in many cases. calloc implementations can "reserve" a region of memory and return a pointer to it without touching it. The OS then maps the virtual memory to physical pages only when you first access it.

A vector, on the other hand, actually goes through and initializes/constructs its elements. No implementation I know of checks that a) the type is POD, b) the memory is zero, and c) the allocator returns zeroed memory. So this initialization process can cost quite a bit compared to calloc.
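To illustrate the element construction described above, here is a small sketch contrasting resize with reserve. The function names are hypothetical; reserve is shown only as a point of comparison, not as something the question used.

```cpp
#include <cstddef>
#include <vector>

// resize(n) value-initializes n elements, so every float is written as 0.0f
// and all of the allocated memory is touched.
std::vector<float> make_resized(std::size_t n) {
    std::vector<float> v;
    v.resize(n);
    return v;
}

// reserve(n) only allocates capacity: size() stays 0 and no element is
// constructed, so the elements cannot be read or indexed yet.
std::vector<float> make_reserved(std::size_t n) {
    std::vector<float> v;
    v.reserve(n);
    return v;
}
```

A reserved vector is only useful if the elements are added afterwards, for example with push_back, so it is not a drop-in replacement for a zero-filled array.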

So the "C" version does next to nothing (if you fix your program), and the "C++" version goes through, initializes every element, and touches all the memory in the allocation. It will be much slower.

That is very rarely a good reason to favor the C version, even where performance matters. In practice, you should only allocate memory you actually need. Once you start using the memory for something, the times will even out (e.g. in the C version, it will take time to map the memory when you access it later on). If you were to create a second timed test which (say) computed the average of the arrays' elements, the C++ version would likely be faster on your implementation, because the memory is already mapped and initialized, whereas the C version would perform the mapping and initialization as you read the memory.
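A rough sketch of such a second timed test follows. It times the allocation and a pass over the elements separately for both versions. The Timing struct and function names are hypothetical, std::chrono::steady_clock is swapped in for the question's system_clock because it is better suited to measuring intervals, and the numbers will vary with the compiler, optimization level, and OS.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <numeric>
#include <vector>

// Hypothetical result type: time to allocate, time to sum, and the sum itself.
struct Timing {
    double alloc_us;
    double sum_us;
    float sum;
};

static double to_us(std::chrono::steady_clock::duration d) {
    return std::chrono::duration<double, std::micro>(d).count();
}

// C version: calloc may return quickly, and the read pass can then pay for
// mapping the pages on first access.
Timing time_calloc_then_sum(std::size_t n) {
    auto t0 = std::chrono::steady_clock::now();
    float* p = static_cast<float*>(std::calloc(n, sizeof(float)));
    auto t1 = std::chrono::steady_clock::now();
    float s = p ? std::accumulate(p, p + n, 0.0f) : 0.0f;
    auto t2 = std::chrono::steady_clock::now();
    std::free(p);
    return Timing{to_us(t1 - t0), to_us(t2 - t1), s};
}

// C++ version: the vector constructor writes every element up front, so the
// read pass runs over memory that is already mapped and initialized.
Timing time_vector_then_sum(std::size_t n) {
    auto t0 = std::chrono::steady_clock::now();
    std::vector<float> v(n);
    auto t1 = std::chrono::steady_clock::now();
    float s = std::accumulate(v.begin(), v.end(), 0.0f);
    auto t2 = std::chrono::steady_clock::now();
    return Timing{to_us(t1 - t0), to_us(t2 - t1), s};
}
```

Comparing alloc_us + sum_us for the two functions, rather than alloc_us alone, gives a fairer picture of the cost of each approach. A single run is noisy, so repeating the measurement several times is advisable.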
