Data race with std::unordered_map, despite locking insertions with mutex

I have a C++11 program that does some computations and uses a std::unordered_map to cache the results. The program is multithreaded, and the threads share a single unordered_map to store and look up those results.

Based on my reading of the unordered_map and STL container specs, as well as the discussion of unordered_map thread safety, it seems that an unordered_map shared by multiple threads can handle one writer at a time, but many readers at a time.

Therefore, I'm using a std::mutex to wrap my insert() calls to the map, so that at most one thread is inserting at a time.

However, my find() calls are not protected by a mutex since, from my reading, many threads should be able to read at once. I'm occasionally getting data races (as detected by TSAN) that manifest as a SEGV. The data race report clearly points to the insert() and find() calls mentioned above.

When I wrap the find() calls in a mutex, the problem goes away. However, I don't want to serialize the concurrent reads, as I'm trying to make this program as fast as possible. (FYI: I'm using gcc 5.4.)

Why is this happening? Is my understanding of the concurrency guarantees of std::unordered_map incorrect?

You still need a mutex for your readers to keep the writers out, but it needs to be a shared one. C++14 has std::shared_timed_mutex, which you can use along with the scoped locks std::unique_lock and std::shared_lock like this:

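#include <iostream>
#include <mutex>          // std::unique_lock
#include <shared_mutex>   // std::shared_timed_mutex, std::shared_lock
#include <string>
#include <unordered_map>

using mutex_type = std::shared_timed_mutex;
using read_only_lock  = std::shared_lock<mutex_type>;
using updatable_lock = std::unique_lock<mutex_type>;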

mutex_type mtx;
std::unordered_map<int, std::string> m;

// code to update map
{
    updatable_lock lock(mtx);

    m[1] = "one";
}

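// code to read from map
{
    read_only_lock lock(mtx);

    // operator[] may insert a missing key, so use find() for reads
    auto it = m.find(1);
    if (it != m.end())
        std::cout << it->second << '\n';
}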
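
To see the two lock modes working together, the snippet above can be fleshed out into a small multithreaded sketch. The function names store, lookup and run_demo are made up for illustration and are not part of the original answer. The read path uses find() because operator[] may insert a missing key and therefore counts as a write.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>          // std::unique_lock
#include <shared_mutex>   // std::shared_timed_mutex, std::shared_lock (C++14)
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

using mutex_type      = std::shared_timed_mutex;
using read_only_lock  = std::shared_lock<mutex_type>;
using updatable_lock  = std::unique_lock<mutex_type>;

mutex_type mtx;
std::unordered_map<int, std::string> m;

// Writer: holds the lock exclusively, so no reader or other writer is active.
void store(int key, const std::string& value) {
    updatable_lock lock(mtx);
    m[key] = value;
}

// Reader: holds the lock in shared mode; many readers may hold it at once.
// Returns true and copies the value into `out` if the key is present.
bool lookup(int key, std::string& out) {
    read_only_lock lock(mtx);
    auto it = m.find(key);          // find() never inserts
    if (it == m.end()) return false;
    out = it->second;
    return true;
}

// Hypothetical driver: one writer and four readers run concurrently.
// Returns the number of entries once every thread has been joined.
std::size_t run_demo() {
    std::vector<std::thread> threads;
    threads.emplace_back([] {
        for (int i = 0; i < 1000; ++i) store(i, std::to_string(i));
    });
    for (int r = 0; r < 4; ++r) {
        threads.emplace_back([] {
            std::string value;
            for (int i = 0; i < 1000; ++i) {
                // A key is either absent or fully written, never half-written.
                if (lookup(i, value)) assert(value == std::to_string(i));
            }
        });
    }
    for (auto& t : threads) t.join();
    return m.size();                // safe: all threads have finished
}
```

Calling run_demo() from main and building with -std=c++14 -pthread (and, if available, -fsanitize=thread) should give a clean TSAN run, assuming every access to m goes through store or lookup.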

There are several problems with that approach.

First, std::unordered_map has two overloads of find: one that is const and one that is not.
The non-const overload does not mutate the map, and the standard's container data-race rules ([container.requirements.dataraces]) require implementations to treat find as a const operation for the purpose of avoiding data races, so concurrent calls to either overload do not race with each other.
Even so, it is good practice to make sure that when multiple threads invoke std::unordered_map::find they do it through the const version, because that also rules out accidentally calling a mutating member such as operator[]. That can be achieved by referring to the map through a const reference and invoking find from there.
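
A minimal sketch of reading through a const reference follows. The helper find_value and the example function are hypothetical names used only for illustration; the sketch shows overload selection, not locking.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper: because the map is taken by const reference, only
// const members (such as the const overload of find) can be called on it.
const std::string* find_value(const std::unordered_map<int, std::string>& map,
                              int key) {
    auto it = map.find(key);                       // resolves to find() const
    return it != map.end() ? &it->second : nullptr;
}

// Usage sketch: returns true when a hit and a miss behave as described.
bool const_find_example() {
    std::unordered_map<int, std::string> cache{{1, "one"}};
    const std::string* hit  = find_value(cache, 1);
    const std::string* miss = find_value(cache, 2);
    assert(cache.size() == 1);                     // a const lookup never inserts
    return hit != nullptr && *hit == "one" && miss == nullptr;
}
```

In a multithreaded program the returned pointer should only be dereferenced while the reader still holds its lock, or if entries are never modified or erased after insertion.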

Second, you missed the part that says many threads may invoke const members such as find concurrently, but no thread may invoke a non-const member on the object at the same time. If some threads call find while another calls insert, that is a data race. Imagine, for example, that insert makes the map reallocate its internal bucket array (a rehash) while another thread is walking it to find the wanted pair.
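
The reallocation is easy to observe in a single thread. The sketch below, with a hypothetical function name, compares bucket_count() before and after a batch of inserts; a growing bucket count means the bucket array was replaced at least once, which is exactly what an unsynchronized find could be reading at that moment.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Returns true if inserting `n` elements forced at least one rehash,
// i.e. the bucket array was reallocated to a larger one.
bool insert_causes_rehash(std::size_t n) {
    std::unordered_map<int, int> map;
    const std::size_t buckets_before = map.bucket_count();
    for (std::size_t i = 0; i < n; ++i) {
        map.emplace(static_cast<int>(i), 0);
    }
    assert(map.size() == n);
    return map.bucket_count() > buckets_before;
}
```

The initial bucket count is implementation-defined, so the exact point at which a rehash happens differs between standard libraries.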

A solution is to use a shared mutex (std::shared_timed_mutex in C++14, std::shared_mutex in C++17), which has exclusive and shared locking modes. When a thread calls find, it takes the lock in shared mode; when a thread calls insert, it takes the lock in exclusive mode.
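
Applied to a result cache like the one in the question, this might look like the sketch below. SharedCache and get_or_compute are hypothetical names, and the choice to run the computation outside any lock is an assumption about what is acceptable for the workload: two threads may occasionally compute the same value, but only the first insertion is kept.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>          // std::unique_lock
#include <shared_mutex>   // std::shared_timed_mutex, std::shared_lock (C++14)
#include <unordered_map>
#include <utility>

// Hypothetical cache class; the name and interface are illustrative only.
template <typename Key, typename Value>
class SharedCache {
public:
    // Returns the cached value for `key`, computing and storing it on a miss.
    template <typename Compute>
    Value get_or_compute(const Key& key, Compute compute) {
        {
            // Fast path: shared lock, many readers at once.
            std::shared_lock<std::shared_timed_mutex> read(mutex_);
            auto it = map_.find(key);
            if (it != map_.end()) return it->second;
        }
        Value value = compute(key);   // runs without holding any lock
        // Slow path: exclusive lock for the insertion.
        std::unique_lock<std::shared_timed_mutex> write(mutex_);
        // If another thread inserted first, emplace leaves its entry in place.
        auto result = map_.emplace(key, std::move(value));
        return result.first->second;
    }

    std::size_t size() const {
        std::shared_lock<std::shared_timed_mutex> read(mutex_);
        return map_.size();
    }

private:
    mutable std::shared_timed_mutex mutex_;
    std::unordered_map<Key, Value> map_;
};

// Usage sketch: returns how many times the computation actually ran.
int cache_example() {
    SharedCache<int, int> cache;
    int calls = 0;
    auto square = [&calls](int x) { ++calls; return x * x; };
    int first  = cache.get_or_compute(7, square);
    int second = cache.get_or_compute(7, square);   // served from the cache
    assert(first == second);
    return calls;
}
```

Returning the value by copy keeps callers from holding references into the map after the lock is released.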

If your compiler does not support a shared mutex, you can use platform-specific synchronization objects, such as pthread_rwlock_t on Linux and SRWLock on Windows.
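
For the POSIX case, a rough sketch with pthread_rwlock_t could look like this. The guard classes and the put_value/get_value functions are hypothetical helpers written for this example; only the pthread_rwlock_* calls come from the platform.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <unordered_map>

// Hypothetical RAII guard: holds the lock in shared (read) mode.
class ReadGuard {
public:
    explicit ReadGuard(pthread_rwlock_t& lock) : lock_(lock) {
        int rc = pthread_rwlock_rdlock(&lock_);
        assert(rc == 0);
        (void)rc;
    }
    ~ReadGuard() { pthread_rwlock_unlock(&lock_); }
    ReadGuard(const ReadGuard&) = delete;
    ReadGuard& operator=(const ReadGuard&) = delete;
private:
    pthread_rwlock_t& lock_;
};

// Hypothetical RAII guard: holds the lock in exclusive (write) mode.
class WriteGuard {
public:
    explicit WriteGuard(pthread_rwlock_t& lock) : lock_(lock) {
        int rc = pthread_rwlock_wrlock(&lock_);
        assert(rc == 0);
        (void)rc;
    }
    ~WriteGuard() { pthread_rwlock_unlock(&lock_); }
    WriteGuard(const WriteGuard&) = delete;
    WriteGuard& operator=(const WriteGuard&) = delete;
private:
    pthread_rwlock_t& lock_;
};

pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;
std::unordered_map<int, std::string> table;

// Insert or overwrite under the exclusive lock.
void put_value(int key, const std::string& value) {
    WriteGuard guard(rwlock);
    table[key] = value;
}

// Look up under the shared lock; copies the value out on a hit.
bool get_value(int key, std::string& out) {
    ReadGuard guard(rwlock);
    auto it = table.find(key);
    if (it == table.end()) return false;
    out = it->second;
    return true;
}
```

On Windows the same structure would apply with the SRW lock functions in place of the pthread calls.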

Another possibility is to use a concurrent hash map, such as tbb::concurrent_hash_map or tbb::concurrent_unordered_map from Intel's Threading Building Blocks library, or concurrency::concurrent_unordered_map from the MSVC Concurrency Runtime. These containers rely on fine-grained locking or lock-free techniques internally, so the operations they document as concurrency-safe can be called from many threads without an external mutex while remaining fast.
