
Fast construction of std::unordered_map<int, std::vector<Thing *>> with preallocated nested vectors?

I want to create a map from int to a vector of Thing*. I know each vector will hold between 1 and 50 Things, no more. How can I allocate space for 50 up front to speed up construction of the map?

I tried three methods, but I am still not sure whether they are fast enough. Can you suggest a better optimization? I last used C++ 10 years ago and am not sure I am doing this correctly. All optimization suggestions are welcome. The code is simplified from the real problem.

#include <iostream>
#include <vector>
#include <unordered_map>

#include <ctime>

class Thing {
};

int main()
{
    clock_t start;
    start = clock();
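    // Method 1: no preallocation; two passes of 25 push_backs fill each vector to 50 elements.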
    auto int_to_thing = std::unordered_map<int, std::vector<Thing *>>();
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 25; j++) {
            int_to_thing[i].push_back(new Thing());
        }
    }
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 25; j++) {
            int_to_thing[i].push_back(new Thing());
        }
    }
    std::cout << (clock() - start) << std::endl;

    start = clock();
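    // Method 2: reserve(50) on each vector before the first push, so pushes up to 50 need no reallocation.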
    int_to_thing = std::unordered_map<int, std::vector<Thing *>>();
    for (int i = 0; i < 1000; i++) {
        int_to_thing[i].reserve(50);
        for (int j = 0; j < 25; j++) {
            int_to_thing[i].push_back(new Thing());
        }
    }
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 25; j++) {
            int_to_thing[i].push_back(new Thing());
        }
    }
    std::cout << (clock() - start) << std::endl;

    start = clock();
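    // Method 3: pre-size each vector with 50 value-initialized (null) pointers via insert.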
    int_to_thing = std::unordered_map<int, std::vector<Thing *>>();
    for (int i = 0; i < 1000; i++) {
        auto it = int_to_thing.find(i);
        if (it == int_to_thing.end()) {  // key not present yet, so insert a pre-sized vector
            auto v = std::vector<Thing *>(50);  // creates 50 null pointers, not just capacity
            auto pair = std::pair<int, std::vector<Thing *>>(i, v);
            int_to_thing.insert(pair);
        }
    }
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 25; j++) {
            int_to_thing[i].push_back(new Thing());
        }
    }
    std::cout << (clock() - start) << std::endl;
    
    return 0;
}

Are you concerned about the construction of the map (then see @ShadowRanger's comment) or the construction of the vectors?

I assume that there are 1..50 Things in a vector, NOT 1..50 vectors in the map.

Your code:

int_to_thing = std::unordered_map<int, std::vector<Thing *>>();
for (int i = 0; i < 1000; i++) {
    int_to_thing[i].reserve(50);

is the best option. It constructs a map of vectors and, inside the loop, creates each vector and pre-allocates room for 50 elements.
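If the construction of the map itself is also a concern (see the comment above), note that std::unordered_map has its own reserve() member, which preallocates buckets for an expected number of keys. A minimal sketch, assuming the 1000 keys from the question:

#include <unordered_map>
#include <vector>

class Thing {};

int main()
{
    std::unordered_map<int, std::vector<Thing *>> int_to_thing;
    int_to_thing.reserve(1000);        // preallocate buckets for ~1000 keys, avoiding rehashes
    for (int i = 0; i < 1000; i++) {
        int_to_thing[i].reserve(50);   // preallocate each vector's storage
    }
    return 0;
}

This avoids rehashing as the map grows; the vector reserve() is a separate concern, discussed next.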

Without that vector reserve() you would likely incur a couple of reallocations while pushing 50 elements into each of those vectors.
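To see those reallocations, here is a minimal illustrative sketch that prints every capacity change while pushing 50 elements into a vector:

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::size_t last_cap = 0;
    for (int i = 0; i < 50; i++) {
        v.push_back(i);
        if (v.capacity() != last_cap) {  // a capacity change means a reallocation occurred
            last_cap = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << last_cap << std::endl;
        }
    }
    return 0;
}

The exact growth sequence is implementation-defined, but you will typically see several reallocations on the way to 50 elements.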

Using:

auto v = std::vector<Thing *>(50);

actually creates 50 elements in your vector and value-initializes them (for pointers, that means 50 nullptrs). It also means subsequent push_backs append after those 50 elements rather than filling them. This may or may not cost you extra: it is cheap with your current use of pointers, but would be expensive if you switched to storing the Thing objects themselves.
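The difference between a pre-sized vector and a reserved one is easy to observe; a minimal sketch (the printed capacity values are implementation-dependent):

#include <iostream>
#include <vector>

class Thing {};

int main()
{
    std::vector<Thing *> sized(50);   // 50 value-initialized (null) pointers: size() == 50
    std::vector<Thing *> reserved;
    reserved.reserve(50);             // storage only: size() == 0, capacity() >= 50
    std::cout << sized.size() << " " << sized.capacity() << std::endl;
    std::cout << reserved.size() << " " << reserved.capacity() << std::endl;
    return 0;
}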

If you are unsure whether something is fast enough, then you are not measuring performance, and that is prima facie evidence that you don't care one iota about it. If you don't measure it, you cannot claim anything about it. Measure it first, before you do anything else; otherwise you'll waste everyone's time.

You are working on the assumption that such preallocations will help. I have an inkling that they won't help at all, since you make so few of them, and that you're just wasting your time. Again: if you are serious about performance, stop now, get measurements in place, and come back with some numbers to talk over.

And don't measure debug builds: only release builds with full optimization turned on, including link-time code generation (LTCG). If you don't optimize, you don't care about performance either. Period. Full stop. Those are the rules.

Yes, you have code that times stuff, but that's not what measurement is about. Measurements need to happen in the context of your use of the data, so that you can see the relative overhead you have. If the task takes an hour and you spend one second doing this part "unoptimally", there's no point optimizing it first: you have bigger fish to fry. Besides, in most contexts the code is cache-driven, i.e., data access patterns determine performance, so I don't believe you're doing anything useful at the moment. Such micro-optimizations in isolation are pointless. This code doesn't exist in a vacuum. If it did, you could just remove it and forget about it all, right?
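For reference, a minimal sketch of such an in-context measurement using std::chrono::steady_clock instead of clock(); the time_it helper and the placeholder workload are illustrative, not from the original post:

#include <chrono>
#include <iostream>

// Illustrative helper: runs a callable once and prints the elapsed wall-clock time.
template <typename F>
void time_it(const char *label, F &&work)
{
    auto t0 = std::chrono::steady_clock::now();
    work();
    auto t1 = std::chrono::steady_clock::now();
    std::cout << label << ": "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count()
              << " ms" << std::endl;
}

int main()
{
    time_it("whole task", [] {
        // ... the real workload goes here, not just the map construction ...
    });
    return 0;
}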
