
How many objects can I iterate over in a vector before L3 cache misses occur?

Let's say I have a class whose data members own heap-allocated memory:

class X {
public:
    std::map<int, double> a;
    std::set<int> b;
    std::vector<int> v;  // "v" is a placeholder name
    std::string c;
};

I also have a std::vector<std::shared_ptr<X>> holding many of these X objects, which I iterate over, reading the first element of each map:

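for (std::size_t i = 0; i < vec.size(); ++i) {
    // a.begin() points at a std::pair<const int, double>; add the mapped value (assumes a is non-empty)
    running_total += vec[i]->a.begin()->second;
}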

Theoretically how many objects should I be able to hold/iterate through in the vector before I encounter L3 cache misses?

I thought the answer would be the number of objects whose cache lines fit in the L3 cache, but L3 size / sizeof(x_element.get()) does not match what I see when profiling.

My L3 cache is 8 MB and each cache line is 64 bytes, so I expected to hold about 125,000 objects before seeing L3 cache misses. However, I see L3 cache misses with far fewer vector elements.
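A rough sanity check of that estimate can be sketched as follows. The cache and line sizes are the figures quoted above; the helper names and the `lines_per_element` parameter are hypothetical, introduced only for illustration. The sketch only divides the cache into lines, so it gives an optimistic upper bound, not a prediction of what a profiler will report.

```cpp
#include <cassert>
#include <cstddef>

// Number of cache lines in a cache of the given size.
inline std::size_t cache_lines(std::size_t cache_bytes, std::size_t line_bytes) {
    assert(line_bytes > 0);
    return cache_bytes / line_bytes;
}

// Optimistic upper bound on how many elements stay resident if every
// iteration touches `lines_per_element` distinct cache lines.
// `lines_per_element` is an assumed input, not something measured here.
inline std::size_t max_resident_elements(std::size_t cache_bytes,
                                         std::size_t line_bytes,
                                         std::size_t lines_per_element) {
    assert(lines_per_element > 0);
    return cache_lines(cache_bytes, line_bytes) / lines_per_element;
}
```

With 8 MiB and 64-byte lines, `cache_lines` gives 131,072 (or 125,000 if 8 MB is read as 8,000,000 bytes). Assuming each iteration touches at least two separate heap locations, the X object and the first map node, `max_resident_elements(8 * 1024 * 1024, 64, 2)` gives 65,536, and that still ignores the vector's own pointer array, associativity conflicts, and everything else sharing the L3.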

On Intel CPUs you can use the Intel Architecture Code Analyzer (IACA) to analyze your loop. If I remember correctly, it can also analyze cache misses if you configure it properly.

Another tool is Valgrind, a simulator whose Cachegrind tool can simulate cache behaviour if you configure it correctly.

But in general, to maximize cache usage, you should separate out the data you iterate over into one linear array that is as small as possible, e.g. one array with the keys (or whatever data you iterate over) and, if possible, another array with the rest. In short, the cache really helps only when the addresses of the data you iterate over are laid out linearly, not scattered at random as they are when you walk many objects allocated in different places on the heap.
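The layout advice above can be sketched like this. It is a minimal illustration, not a drop-in fix: `X` here is a trimmed-down stand-in for the class in the question, the function names are hypothetical, and treating an empty map as contributing 0.0 is an assumption.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Trimmed-down stand-in for the class in the question.
struct X {
    std::map<int, double> a;
};

// Pointer-chasing version: each iteration follows the shared_ptr to a
// heap-allocated X, then follows the map to a separately allocated node.
inline double sum_via_pointers(const std::vector<std::shared_ptr<X>>& vec) {
    double total = 0.0;
    for (const auto& p : vec) {
        if (!p->a.empty()) total += p->a.begin()->second;
    }
    return total;
}

// Copy the one value the loop needs into a contiguous "hot" array.
// This pays the pointer-chasing cost once, up front.
inline std::vector<double> extract_first_values(
        const std::vector<std::shared_ptr<X>>& vec) {
    std::vector<double> hot;
    hot.reserve(vec.size());
    for (const auto& p : vec) {
        hot.push_back(p->a.empty() ? 0.0 : p->a.begin()->second);
    }
    return hot;
}

// Linear walk over contiguous doubles: eight values per 64-byte cache line.
inline double sum_contiguous(const std::vector<double>& hot) {
    double total = 0.0;
    for (double v : hot) total += v;
    return total;
}
```

The hot array has to be kept in sync whenever a map's first element changes, so this trade-off is worthwhile mainly when the loop runs far more often than the maps are modified.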
