简体   繁体   中英

How can I fill a std::unorderd_map concurrently?

I need to fill a std::unordered_map<int,T> with about 100 entries. Those are expensive to construct and I would like to use OpenMP to do that concurrently:

unordered_map<int, T> mapWithTs;

#pragma omp parallel for schedule(dynamic) // dynamic because T constructs in some unpredictable time.
for(int i=0; i<100; ++i)
{
  mapWithTs.emplace(i, {i}) // calls the constructor T(i)
}

I read that the map will rehash and then iterators will no longer be valid. What must I do to make this work?

Further, what would the concurrency solution with the standard library look like?

In case these expensive to construct instances are help by reference, ie by shared_ptr, raw pointer, &c., I suggest letting each thread create its own, stack-local map, in a step canonically also called "map", and then combine them all in a single thread in a step canonically called "reduce".

This is called the "map-reduce" algorithm.

"map" is the usual name of a function that applies a function to all elements of a collection

"reduce" is the usual name of a function that combines all elements in a collection to a single value by calling a function with the current intermediate result and each element

hence the name :)

As noted by Galik and yeoman, it is essential to make moving of your objects a cheap operation. If it already is (construction is heavy, but moving is cheap), then you are fine. Otherwise you should put your objects in-to a uniq_ptr 's. After this rehash will be a cheap operation too (yes, rehash takes linear time, but it is 0(1) amortized complexity). So you don't need to worry about rehash.

The next thing is filling the map. You are working with it from several threads so you have to ensure that no more than one thread works with it simultaneously. You need something like #pragma omp critical or std::mutex. And here comes important part: if you use emplace like you do it now, then the heavy T constructor will be executed under critical section which kills whole idea of parallelization. So in this particular case you would prefer creating object T beforehand, then entering critical section and moving the object in-to a hashmap.

If construction of the T is really a heavy operation (it takes much longer, then inserting a value in-to a unordered_map), then that will be all. You won't get performance gain by making per thread lists and inserting them in-to the map. Otherwise, yeoman's answer can give you additional benefits by the cost of increasing complexity of your code.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM