Multithreading unordered_map

Question

I'm working in a multithreading environment. Basically, I have an unordered_map which could be accessed by multiple threads at the same time. Right now, my algorithm is:

function foo(key) {
  scoped_lock()
  if key exists {
    return Map[key]
  }

  value = get_value()
  Map[key] = value
}

Obviously the performance are not good with this implementation. Is there any algorithm/approach that I can use in order to improve performance?

EDIT:

I did a number of tests and I thought about the double-checked locking. So, I modified the code with:

function foo(key) {
  if key exists {
    return Map[key]
  }

  scoped_lock()
  if key exists {
    return Map[key]
  }

  value = get_value()
  Map[key] = value
}

Actually I only add another check before the scoped_lock(). In this scenario, suppose function is called N times. If the first m calls to foo , where m < N , filled the map and the next N - m calls only get values from the map, I don't need the exclusive access. Moreover, there is another check after the scoped_lock which ensures thread safety. Am I right? In any case, with the first code the execution needs ~208s, while the second one ~200s.

Answer 1

Here is a utility class:

template<class T, class M=std::mutex, template<class...>class S=std::unique_lock, template<class...>class U=std::unique_lock>
struct mutex_protected {
  template<class F>
  auto read( F&& f ) const
  -> typename std::result_of<F&&(T const&)>::type
  {
    auto l = lock();
    return std::forward<F>(f)(data);
  }
  template<class F>
  auto write( F&& f ) 
  -> typename std::result_of<F&&(T&)>::type
  {
    auto l = lock();
    return std::forward<F>(f)(data);
  }
  mutex_protected(mutex_protected&&)=delete;
  mutex_protected& operator=(mutex_protected&&)=delete;

  template<class...Args>
  mutex_protected( Args&&...args ):
    data( std::forward<Args>(args)... )
  {}
private:
  mutable M m;
  T data;

  U<M> lock() { return U<M>(m); }
  S<M> lock() const { return S<M>(m); }
};

it, especially in c++14 , lets you interact with a mutex protected instance of data in an easy to write way.

In c++14 you can use std::shared_timed_mutex and in c++17 you can use std::shared_mutex like this:

template<class T>
using rw_guarded = mutex_guarded< T, std::shared_mutex, std::shared_lock >;

this enables the ability to have many readers at once. But you should first determine if the simple mutex one is fast enough.

struct cache {
  using Key=std::string;
  using Value=int;
  using Map=std::unordered_map< Key, Value >;
  Value get( Key const& k ) {
    Value* r = table.read([&](Map const& m)->Value*{
      auto it = m.find(k);
      if (it == m.end()) return nullptr;
      return std::addressof( it->second );
    });
    if (r) return *r;
    return table.write([&](Map& m)->Value{
      auto it = m.find(k);
      if (it != m.end()) return it->second;
      auto r = m.insert( std::make_pair(k, 42) ); // construct data here
      return r.first->second;
    });
  }
private:
  mutex_guarded< std::unordered_map< Key, Value > > table;
};

Upgrade mutex_guarded to rw_guarded and it switches to reader-writer locks.

Here is a more complex version:

Have two maps; one to value, one to shared future of value.

Use a reader writer lock (aka shared mutex).

To get, get a shared lock. Check if it is there. If it is, return.

Unlock first map. Lock second map for writing. If there isn't already a shared future under the key, add one. Unlock map 2, and wait on shared future regardless of if you added it.

When done, lock first map for reading; check if result already there. If yes, return it. If not, unlock, relock for writing, move data into map 1 if not already there, return data in first map.

This is designed to minimize the period map 1 is locked for exclusively, allowing max concurrency there.

Other designs will optimize other considerations.

Do not use operator[] . Do not interact with any map without a lock of some kind active. Know which locks correspond to which map. Note that reading elements (not looking up) can be done without a lock in some cases. Sometimes reading copies of shared things is required, not the shared thing. Look up the docs of every type to determine what operations need what locks.

Multithreading unordered_map

Question

1 answers

solution1
2 ACCPTED 2018-03-12 17:18:09

Multithreading unordered_map

Question

1 answers

solution1 2 ACCPTED 2018-03-12 17:18:09

solution1
2 ACCPTED 2018-03-12 17:18:09