Ways to handle concurrent STL-container expansions efficiently

I have this cute little problem: I have an STL container (unordered_map or vector) that may be both read and expanded by any number of threads, and I want to do it as efficiently as possible (i.e. reduce the lock latency as much as possible).

Now, the easy part is to use a shared lock for reading and an exclusive lock for extending, so I might be doing something like this:

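boost::shared_mutex myLock;            // the lockable type; boost::shared_lock is only a guard for it
std::unordered_map<Key, Value> myMap;  // Key and Value stand in for the real types

...

myLock.lock_shared();
// try looking up key
myLock.unlock_shared();
if (!success) {
    myLock.lock();
    // retry looking up key: another thread may have inserted it in the meantime
    if (!success) {
        myMap[key] = value;
    }
    myLock.unlock();
}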

While I believe this to be rather efficient in the general case, the time the exclusive lock is held may explode if myMap decides that it needs to reallocate its internal storage. The time that lock is held might go up to several hundred microseconds, and it will block any thread that attempts to read the container during this time.

Does anybody know of an idiom that can be used to avoid blocking all reading threads while the storage is reallocated?

Of course, I know the math that tells me that these latencies will not accumulate to bring down overall performance; nevertheless, I would be happier if I could limit the possible latencies somehow.

I can see two possible solutions here, but they may be incompatible with the other logic of your program.

1) You can preallocate enough elements once (or at least more rarely than the STL does) with vector::reserve() / unordered_map::reserve(). If that is suitable for your program, you will avoid further container reallocations or at least reduce their number.

2) You can use the copy-modify-swap approach. It may greatly increase the number of memory allocations and copies, but it will also greatly decrease the time during which threads are blocked.

The code would be something like this:
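To illustrate option 1, here is a minimal sketch showing that, once reserve() has been called for a given element count, inserting up to that many elements causes no rehash (unordered_map) and no reallocation (vector). The function names insertsWithoutRehash and pushesWithoutReallocation are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: true if `count` insertions after reserve(count)
// never changed the bucket count, i.e. no rehash took place.
bool insertsWithoutRehash(std::size_t count) {
    std::unordered_map<int, int> m;
    m.reserve(count);  // buckets for at least `count` elements
    const std::size_t buckets = m.bucket_count();
    for (std::size_t i = 0; i < count; ++i) {
        m[static_cast<int>(i)] = 0;
        if (m.bucket_count() != buckets) return false;  // a rehash happened
    }
    return true;
}

// Hypothetical helper: true if `count` push_backs after reserve(count)
// never moved the vector's storage, i.e. no reallocation took place.
bool pushesWithoutReallocation(std::size_t count) {
    std::vector<int> v;
    v.reserve(count);  // capacity for at least `count` elements
    const int* storage = v.data();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.data() != storage) return false;  // storage was reallocated
    }
    return true;
}
```

With enough space reserved up front, the exclusive lock only ever covers a single insertion, never a reallocation. The catch is that you need a usable upper bound on the element count.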

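std::atomic<std::unordered_map<Key, Value>*> pmyMap;  // Key and Value stand in for the real types

void ChangeMap() {
  std::unordered_map<Key, Value>* pOldMap = pmyMap.load();
  std::unordered_map<Key, Value>* pNewMap = nullptr;
  do {
    delete pNewMap; // drop the copy from a failed attempt (deleting nullptr is a no-op)
    pNewMap = new std::unordered_map<Key, Value>(*pOldMap); // copy
    (*pNewMap)[key] = value; // modify the copy
  } while (!pmyMap.compare_exchange_weak(pOldMap, pNewMap)); // swap; on failure pOldMap is reloaded
  // pOldMap must not be deleted here: readers may still be using it
}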

In that case you can read from the container without any locks at all. The main problem is deleting the old copy of the container correctly. You can't simply delete it at the end of ChangeMap(), because another thread may be reading from it (e.g. iterating over it) at the same time. To delete the old container correctly, you have to track the reading threads in some way and delete the container object only when all reads have finished.

Intel's TBB library is very good. I've used concurrent_hash_map.

There are two obvious solutions: reserve enough space before you start, or copy-modify-swap.

Both have significant downsides. If you don't reserve enough, you end up having to occasionally block everything to re-reserve. And copy-modify-swap is usually overkill.

So do both.

Keep track of when you will need a reallocation. When you don't need a reallocation, simply lock and write.

When you will need a reallocation: copy, reallocate, store, then lock and swap in.

Now things get tricky here. If two threads try to store, we want the second to block. So we require a two-layer system, where writing threads block on the copy-swap if it is happening, and reading threads only block on the lock-and-swap-back.
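One way to track the readers is to let a reference count do it. The sketch below is not the raw-pointer code above: it swaps in std::shared_ptr, so each reader holds a snapshot and the old map is destroyed when its last holder lets go. It also assumes a small mutex that protects only the pointer itself, so reads are not strictly lock-free, but the lock never covers a copy or a reallocation. With C++20, std::atomic<std::shared_ptr<T>> could replace that mutex. SnapshotMap and its members are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper: copy-modify-swap with reference-counted snapshots.
class SnapshotMap {
public:
    using Map = std::unordered_map<std::string, int>;

    // Readers take a snapshot; ptrMutex_ covers only the pointer copy.
    std::shared_ptr<const Map> snapshot() const {
        std::lock_guard<std::mutex> guard(ptrMutex_);
        return current_;
    }

    // Writers copy, modify, then swap the pointer in.
    void insert(const std::string& key, int value) {
        std::lock_guard<std::mutex> writer(writerMutex_);  // one writer at a time
        auto copy = std::make_shared<Map>(*snapshot());    // copy (slow, readers unaffected)
        (*copy)[key] = value;                              // modify the copy
        std::shared_ptr<const Map> next = std::move(copy);
        {
            std::lock_guard<std::mutex> guard(ptrMutex_);
            current_.swap(next);                           // swap (pointer only)
        }
        // `next` now refers to the old map. It is released here, outside
        // ptrMutex_, and destroyed once no reader holds a snapshot of it.
    }

private:
    mutable std::mutex ptrMutex_;  // protects current_ (the pointer, not the map)
    std::mutex writerMutex_;       // serializes writers
    std::shared_ptr<const Map> current_ = std::make_shared<Map>();
};
```

A reader calls snapshot() once and then looks up or iterates over the returned map for as long as it likes. It will not see later insertions until it takes a new snapshot.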

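boost::shared_mutex myLock; // readers share it; a writer takes it exclusively only for the final update
boost::mutex lock2;         // serializes writers
std::unordered_map<Key, Value> myMap; // Key and Value stand in for the real types

...

myLock.lock_shared();
// try looking up key
myLock.unlock_shared();
if (!success) {
  lock2.lock(); // write-exclusive
  // "we have room to add another value": one more element will not trigger a rehash
  // (for a vector the test would be size() < capacity())
  if (myMap.size() + 1 <= myMap.max_load_factor() * myMap.bucket_count()) {
    myLock.lock(); // block readers
    myMap[key] = value;
    myLock.unlock(); // allow readers back in, time is one write with no realloc
  } else {
    auto myCopy = myMap; // ouch
    myCopy.reserve(2 * myMap.size() + 1); // 50% more memory, or 2x as much memory, or whatever
    myCopy[key] = value;
    myLock.lock(); // block readers
    using std::swap;
    swap(myCopy, myMap);
    myLock.unlock(); // allow readers back in, time is 1 swap!
  }
  lock2.unlock();
}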

A downside to this is that other writers may have to wait a long time to write if we are doing a realloc.

If you have only one writer thread, lock2 need not exist.
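For reference, the two-lock scheme can be packaged as a small class. This is only a sketch under a few assumptions: it uses C++17's std::shared_mutex and std::mutex in place of the Boost types, RAII guards instead of manual lock()/unlock() calls, and a fixed std::string to int mapping. GrowableMap, readersLock_ and writersLock_ are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical wrapper around the "reserve, else copy and swap" scheme.
class GrowableMap {
public:
    // Readers hold the shared lock only for the lookup itself.
    bool find(const std::string& key, int& out) const {
        std::shared_lock<std::shared_mutex> read(readersLock_);
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

    void insert(const std::string& key, int value) {
        std::lock_guard<std::mutex> writer(writersLock_);  // lock2 in the text
        // Only writers modify map_, and writersLock_ is held, so reading
        // map_ here without readersLock_ does not race with anything.
        if (map_.size() + 1 <= map_.max_load_factor() * map_.bucket_count()) {
            std::unique_lock<std::shared_mutex> block(readersLock_);
            map_[key] = value;  // one write, no rehash
        } else {
            auto copy = map_;                    // slow copy, readers keep running
            copy.reserve(2 * map_.size() + 1);   // growth policy is arbitrary
            copy[key] = value;
            {
                std::unique_lock<std::shared_mutex> block(readersLock_);
                map_.swap(copy);                 // readers wait only for the swap
            }
            // `copy` now owns the old storage and frees it after readers
            // have been let back in.
        }
    }

private:
    mutable std::shared_mutex readersLock_;  // myLock in the text
    std::mutex writersLock_;                 // lock2 in the text
    std::unordered_map<std::string, int> map_;
};
```

Callers that want the check-then-insert pattern from the question would call find() first and fall back to insert() on a miss; insert() simply overwrites if another writer got there first.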
