Safe multi-thread counter increment

For example, I have some work that is computed simultaneously by multiple threads.

For demonstration purposes, the work is performed inside a while loop. In a single iteration each thread performs its own portion of the work, and the counter should be incremented exactly once before the next iteration begins.

My problem is that the counter is incremented by every thread rather than once per iteration.

As this seems like a relatively simple thing to want to do, I presume there is a 'best practice' or common way to go about it?

Here is some sample code to illustrate the issue and help the discussion along. (I'm using Boost threads.)

#include <boost/thread/mutex.hpp>

class someTask {
public:
    int mCounter; // initialized to 0
    int mTotal;   // initialized to e.g. 100000
    boost::mutex cntmutex;

    // Reads the counter while holding the mutex.
    int getCount()
    {
        boost::mutex::scoped_lock lock( cntmutex );
        return mCounter;
    }

    void process( int thread_id, int numThreads )
    {
        while ( getCount() < mTotal )
        {
            // The main task is performed here and is divided
            // into sub-tasks based on the thread_id and numThreads

            // Wait for all threads to get to this point

            cntmutex.lock();
            mCounter++;  // <---- how to ensure this is only updated once?
            cntmutex.unlock();
        }
    }
};

The main problem I see here is that you are reasoning at too low a level. Therefore, I am going to present an alternative solution based on the new C++11 thread API.

The main idea is that you essentially have a schedule -> dispatch -> do -> collect -> loop routine. In your example you try to reason about all of this from within the do phase, which is quite hard. Your pattern can be expressed much more easily using the opposite approach.

First, we isolate the work to be done in its own routine:

void process_thread(size_t id, size_t numThreads) {
    // do something
}

Now we can easily invoke this routine:

#include <future>
#include <thread>
#include <vector>

void process(size_t const total, size_t const numThreads) {
    for (size_t count = 0; count != total; ++count) {
         std::vector< std::future<void> > results;

         // Create all threads, launch the work!
         for (size_t id = 0; id != numThreads; ++id) {
             results.push_back(std::async(process_thread, id, numThreads));
         }

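         // Wait for every task explicitly. The destruction of a `std::future`
         // obtained from `std::async` also waits for the task (*), but only
         // if the task was launched asynchronously, so do not rely on it.
         for (auto& result : results) { result.get(); }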
    }
}

(*) See this question.

You can read more about std::async here, and a short introduction is offered here (they appear to be somewhat contradictory on the effect of the launch policy, oh well). It is simpler here to let the implementation decide whether or not to create OS threads: it can adapt depending on the number of available cores.

Note how the code is simplified by removing shared state. Because the threads share nothing, we no longer have to worry about explicit synchronization!
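As a usage sketch of the routine above, here is one way the collect step could look when each sub-task returns a value. The slicing scheme, the `data` parameter, and the return types are assumptions made for illustration, not part of the question. Calling `get()` on each future works under either launch policy, and the iteration counter is only ever touched by the dispatching thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Hypothetical sub-task: worker `id` sums every numThreads-th element.
long process_thread(std::vector<int> const& data, std::size_t id,
                    std::size_t numThreads) {
    long sum = 0;
    for (std::size_t i = id; i < data.size(); i += numThreads) {
        sum += data[i];
    }
    return sum;
}

// Runs `total` iterations and returns the result of the last one.
// `count` lives on the dispatching thread, so it needs no locking.
long process(std::vector<int> const& data, std::size_t total,
             std::size_t numThreads) {
    assert(numThreads > 0);
    long last = 0;
    for (std::size_t count = 0; count != total; ++count) {
        std::vector<std::future<long>> results;

        // Dispatch: one task per slice of the work.
        for (std::size_t id = 0; id != numThreads; ++id) {
            results.push_back(
                std::async(process_thread, std::cref(data), id, numThreads));
        }

        // Collect: get() blocks until the corresponding task has finished.
        last = 0;
        for (auto& result : results) {
            last += result.get();
        }
    }
    return last;
}
```

Each task only reads `data` and writes its own local `sum`, so the only synchronization point is the `get()` call.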

You protected the counter with a mutex, ensuring that no two threads can access the counter at the same time. Your other options would be Boost.Atomic, C++11 atomic operations, or platform-specific atomic operations.

However, your code seems to access mCounter without holding the mutex:

    while ( mCounter < mTotal )

That's a problem. You need to hold the mutex to access the shared state.

You may prefer to use this idiom:

  1. Acquire lock.

  2. Do tests and other things to decide whether we need to do work or not.

  3. Adjust accounting to reflect the work we've decided to do.

  4. Release lock. Do work. Acquire lock.

  5. Adjust accounting to reflect the work we've done.

  6. Loop back to step 2 unless we're totally done.

  7. Release lock.
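The steps above can be sketched roughly as follows, using the C++11 standard library rather than Boost. The `Task` type, its `claimed` and `completed` fields, and the empty `do_work` placeholder are hypothetical names introduced for illustration. Because each unit of work is claimed by exactly one thread while the lock is held, the accounting is updated exactly once per unit.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical task: `total` units of work shared among the workers.
struct Task {
    std::mutex mutex;
    int claimed = 0;    // units handed out so far (guarded by mutex)
    int completed = 0;  // units finished so far (guarded by mutex)
    int total = 0;

    void worker() {
        std::unique_lock<std::mutex> lock(mutex);  // 1. acquire lock
        while (claimed < total) {                  // 2. is there work left?
            int const unit = claimed++;            // 3. account for the claim
            lock.unlock();                         // 4. release lock,
            do_work(unit);                         //    do work,
            lock.lock();                           //    acquire lock
            ++completed;                           // 5. account for the result
        }                                          // 6. loop back to step 2
    }                                              // 7. lock released on exit

    static void do_work(int /*unit*/) { /* placeholder for the real work */ }
};

// Runs the task on numThreads workers and returns the completed count.
int run(int total, int numThreads) {
    assert(numThreads > 0);
    Task task;
    task.total = total;
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back(&Task::worker, &task);
    }
    for (auto& t : threads) {
        t.join();
    }
    return task.completed;
}
```

Note that the lock is never held while `do_work` runs, so the workers only serialize on the short accounting sections.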

You need to use a message-passing solution. This is more easily enabled by libraries like TBB or PPL. PPL is included for free in Visual Studio 2010 and above, and TBB can be downloaded for free under a FOSS licence from Intel.

concurrent_queue<size_t> done;
std::vector<Work> work;
// fill work here
// Both bounds must have the same type for parallel_for's template deduction.
parallel_for(size_t(0), work.size(), [&](size_t i) {
    processWorkItem(work[i]);
    done.push(i);
});

It's lockless, and you can have an external thread monitor the done variable to see how much, and what, has been completed.

I would like to disagree with David on acquiring the lock multiple times to do the work.

Mutexes are expensive: when many threads contend for a mutex, it basically falls back to a system call, which results in a user-space to kernel-space context switch, with the calling thread(s) forced to sleep. That is a lot of overhead.

So if you are using a multiprocessor system, I would strongly recommend using spin locks instead [1].
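For reference, a minimal spin lock can be sketched with `std::atomic_flag` from C++11. The `SpinLock` class and the `count_with_spinlock` helper are hypothetical names used only for this illustration, and a busy-waiting lock like this is generally only worthwhile when the critical section is very short.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Minimal spin lock: waiting threads busy-wait instead of sleeping.
class SpinLock {
public:
    void lock() {
        // test_and_set returns the previous value; spin while it was set.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Each thread increments a shared counter under the spin lock.
int count_with_spinlock(int numThreads, int incrementsPerThread) {
    assert(numThreads > 0);
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < incrementsPerThread; ++i) {
                lock.lock();
                ++counter;  // very short critical section
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```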

So what I would do is:

=> Get rid of the scoped lock acquisition used to check the condition.

=> Make your counter volatile to support the above.

=> In the while loop, check the condition again after acquiring the lock.

class someTask {
 public:
 volatile int mCounter; // initialized to 0; the counter is made volatile
 int mTotal;            // initialized to e.g. 100000
 boost::mutex cntmutex;                

 void process( int thread_id, int numThreads )
 {
    while ( mCounter < mTotal ) //compare without acquiring lock
    {
        // The main task is performed here and is divided 
        // into sub-tasks based on the thread_id and numThreads

        cntmutex.lock();
        // Now compare again to make sure that the condition still holds.
        // This saves all the lock acquisitions and releases we did just to
        // check whether the condition was true.
        if(mCounter < mTotal)
        {
             mCounter++;  
        }

        cntmutex.unlock();
    }
 }
};
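One caveat worth noting: in standard C++ (since C++11), `volatile` does not make concurrent access to a plain `int` well defined, so the unlocked read in the loop condition is formally a data race. A sketch of the same check-then-increment logic using `std::atomic<int>` and a compare-and-swap might look like the following. The `SomeTask` constructor, the `count()` accessor, and the `run_atomic` helper are assumptions added for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SomeTask {
public:
    explicit SomeTask(int total) : mCounter(0), mTotal(total) {}

    void process() {
        int seen = mCounter.load();  // atomic read, no lock needed
        while (seen < mTotal) {
            // The main task would be performed here.

            // Increment only if the counter still equals `seen`. On failure
            // `seen` is refreshed with the current value and the loop
            // condition is checked again.
            if (mCounter.compare_exchange_weak(seen, seen + 1)) {
                ++seen;
            }
        }
    }

    int count() const { return mCounter.load(); }

private:
    std::atomic<int> mCounter;
    int const mTotal;
};

// Runs process() on numThreads threads and returns the final counter.
int run_atomic(int total, int numThreads) {
    assert(numThreads > 0);
    SomeTask task(total);
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back(&SomeTask::process, &task);
    }
    for (auto& t : threads) {
        t.join();
    }
    return task.count();
}
```

The exchange succeeds only while the counter is still below `mTotal`, so the counter can never overshoot the total.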

[1] http://www.alexonlinux.com/pthread-mutex-vs-pthread-spinlock
