Simultaneous Threads in C++ using <thread>

I have been looking around and I am not sure why this is happening. I've seen lots of tutorials on using threads on Linux, but not much on what I am sharing right now.

Code:

int j = 0;
while(j <= 10)
{
    myThreads[j] = std::thread(task, j);
    myThreads[j].join();
    j+=1;
}

So I am simply trying to create 10 threads and execute them all. The task itself is pretty simple and works fine, but the problem is that the threads are not all being executed at the same time.

It executes only one thread, waits for it to finish, then executes the next one, and so on.

PS: I know that the main function will quit after starting those threads, but I have read about this and I am sure I can fix it in many ways.

So I want to execute all those threads simultaneously.

Thanks a lot in advance, MarioAda.

You are starting threads and then joining them right away. You need to create them all, let them do their work, and only then join them in a separate loop. Besides, you would generally put the threads in a vector so you can reference and join them (which you seem to be doing, although with an array; since this is tagged C++, I encourage you to use a std::vector instead).

The strategy is the same as with pthreads before it: you declare an array of threads, start them running, and then join them.

The code below is from here.

#include <thread>
#include <iostream>
#include <vector>

void hello(){
    std::cout << "Hello from thread " << std::this_thread::get_id() << std::endl;
}

int main(){
    std::vector<std::thread> threads;

    for(int i = 0; i < 5; ++i){
        threads.push_back(std::thread(hello));
    }

    for(auto& thread : threads){
        thread.join();
    }

    return 0;
}

That's because join blocks the current thread until your thread has finished. You should only start your threads in the loop you already have, and call the join() function of the threads in a second loop.

There is a slightly more advanced technique for making such threads start even closer to simultaneously.

The problem with the naive approach is that the threads created first have plenty of time to run their functions before the last threads are even created. So, by the time the last threads are created, the first ones have already executed significant parts of their functions.

In order to avoid that, we can use a counter (protected by a mutex) and a condition variable. Each thread that has been created and is ready to start running its function increments the counter and checks whether it has become equal to the total number of threads (i.e., whether this thread was the last one to increment the counter). If it was, it notifies all the other threads (using the condition variable) that it's time to start. Otherwise, it waits on the condition variable until some other thread brings the counter up to the total and notifies the waiting threads (including this one).

This way, all the threads start (almost) simultaneously, and only after every one of them has been created and is actually ready to execute its function.

Here is my implementation of a class, ConcurrentRunner, which does that.

First, a C++11-compliant simplified version that will be easier to understand:

#include <mutex>
#include <condition_variable>
#include <vector>
#include <functional>
#include <thread>

// Object that runs multiple functions, each in its own thread, starting them as simultaneously as possible.
class ConcurrentRunner final
{
public:
    template<typename... BackgroundThreadsFunctions>
    explicit ConcurrentRunner(const std::function<void()>& this_thread_function, const BackgroundThreadsFunctions&... background_threads_functions)
        : _this_thread_function{this_thread_function}
        , _num_threads_total{1 + sizeof...(BackgroundThreadsFunctions)}
    {
        this->PrepareBackgroundThreads({ background_threads_functions... });
    }

    ConcurrentRunner(const ConcurrentRunner&) = delete;
    ConcurrentRunner& operator=(const ConcurrentRunner&) = delete;

    // Executes `ThreadProc` for this thread's function and waits for all of the background threads to finish.
    void Run()
    {
        this->ThreadProc(_this_thread_function);

        for (auto& background_thread : _background_threads)
            background_thread.join();
    }

private:
    // Creates the background threads: each of them will execute `ThreadProc` with its respective function.
    void PrepareBackgroundThreads(const std::vector<std::function<void()>>& background_threads_functions)
    {
        // Iterate through the vector of the background threads' functions and create a new thread with `ThreadProc` for each of them.
        _background_threads.reserve(background_threads_functions.size());
        for (const auto& background_thread_function : background_threads_functions)
        {
            // Capture the function by copy so that the new thread owns its own instance.
            _background_threads.emplace_back([this, background_thread_function]()
            {
                this->ThreadProc(background_thread_function);
            });
        }
    }

    // Procedure that will be executed by each thread, including the "main" thread and all background ones.
    void ThreadProc(const std::function<void()>& function)
    {
        // Increment `_num_threads_waiting_for_start_signal` while the mutex is locked, thus signaling that a new thread is ready to start.
        std::unique_lock<std::mutex> lock{_mutex};
        ++_num_threads_waiting_for_start_signal;
        const bool ready_to_go = (_num_threads_waiting_for_start_signal == _num_threads_total);
        lock.unlock();

        if (ready_to_go)
        {
            // If this thread was the last one of the threads which must start simultaneously, notify all other threads that they are ready to start.
            _cv.notify_all();
        }
        else
        {
            // If this thread was not the last one of the threads which must start simultaneously, wait on `_cv` until all other threads are ready.
            lock.lock();
            _cv.wait(lock, [this]()
                     {
                         return (_num_threads_waiting_for_start_signal == _num_threads_total);
                     });
            lock.unlock();
        }

        // Execute this thread's internal function.
        function();
    }

private:
    std::function<void()> _this_thread_function;
    std::vector<std::thread> _background_threads;

    const unsigned int _num_threads_total;
    unsigned int _num_threads_waiting_for_start_signal{0}; // counter of the threads which are ready to start running their functions
    mutable std::mutex _mutex; // mutex that protects the counter
    std::condition_variable _cv; // waited on by all threads but the last one; notified when the last thread increments the counter
};

//---------------------------------------------------------------------------------------------------------------------------------------------------
// Example of usage:

#include <atomic>

int main()
{
    std::atomic<int> x{0};

    {
        ConcurrentRunner runner{[&]() { x += 1; }, [&]() { x += 10; }, [&]() { x += 100; }};
        runner.Run();
    }

    return (x.load() == 111) ? 0 : -1;
}

And now the same logic with more templates, fewer allocations, and no unnecessary copies or type erasure, but somewhat harder to read (requires C++17):

//---------------------------------------------------------------------------------------------------------------------------------------------------
// Helper template `ForEachTupleElement` (meant to be in some other header file).

#include <tuple>
#include <type_traits>
#include <utility>

namespace Detail
{
    template<typename Tuple, typename Function, std::size_t... I>
    constexpr void ForEachTupleElement(Tuple&& tuple, Function function, std::index_sequence<I...>)
    {
        int dummy[] = { 0, (((void)(function(std::get<I>(std::forward<Tuple>(tuple))))), 0)... };
        (void)dummy;
    }
}

// Applies a given function (typically - with a template operator(), e.g., a generic lambda) to each element of a tuple.
template<typename Tuple, typename Function>
constexpr void ForEachTupleElement(Tuple&& tuple, Function function)
{
    Detail::ForEachTupleElement(std::forward<Tuple>(tuple), function,
                                std::make_index_sequence<std::tuple_size_v<std::remove_cv_t<std::remove_reference_t<Tuple>>>>{});
}

//---------------------------------------------------------------------------------------------------------------------------------------------------

#include <mutex>
#include <condition_variable>
#include <array>
#include <thread>
#include <tuple>
#include <type_traits>
#include <utility>

// Common non-template part of the `ConcurrentRunner` implementation.
class ConcurrentRunnerBase
{
protected:
    inline ConcurrentRunnerBase() = default;
    inline ~ConcurrentRunnerBase() = default;

protected:
    unsigned int _num_threads_waiting_for_start_signal{0}; // protected by `_mutex`
    mutable std::mutex _mutex;
    std::condition_variable _cv; // waited on by all threads but the last one; notified when the last thread increments the counter
};

// Object that runs multiple functions, each in its own thread, starting them as simultaneously as possible.
template<typename ThisThreadFunction, std::size_t NumberOfBackgroundThreads>
class ConcurrentRunner final : private ConcurrentRunnerBase
{
public:
    template<typename ThisThreadFunctionArg, typename... BackgroundThreadsFunctions>
    explicit ConcurrentRunner(ThisThreadFunctionArg&& this_thread_function, BackgroundThreadsFunctions&&... background_threads_functions)
        : _this_thread_function{std::forward<ThisThreadFunctionArg>(this_thread_function)}
    {
        static_assert(sizeof...(BackgroundThreadsFunctions) == NumberOfBackgroundThreads);
        this->Prepare(std::forward<BackgroundThreadsFunctions>(background_threads_functions)...);
    }

    ConcurrentRunner(const ConcurrentRunner&) = delete;
    ConcurrentRunner& operator=(const ConcurrentRunner&) = delete;

    // Executes `ThreadProc` for this thread's function and waits for all of the background threads to finish.
    void Run()
    {
        this->ThreadProc(std::move(_this_thread_function));

        for (auto& background_thread : _background_threads)
            background_thread.join();
    }

private:
    // Creates the background threads: each of them will execute `ThreadProc` with its respective function.
    template<typename... BackgroundThreadsFunctions>
    void Prepare(BackgroundThreadsFunctions&&... background_threads_functions)
    {
        // Copies of the argument functions (created by move constructors where possible), collected in a tuple.
        std::tuple<std::decay_t<BackgroundThreadsFunctions>...> background_threads_functions_tuple{
            std::forward<BackgroundThreadsFunctions>(background_threads_functions)...
        };

        // Iterate through the tuple of the background threads' functions and create a new thread with `ThreadProc` for each of them.
        unsigned int index_in_array = 0;
        ForEachTupleElement(std::move(background_threads_functions_tuple), [this, &index_in_array](auto&& function)
                            {
                                auto i = index_in_array++;
                                _background_threads[i] = std::thread{[this, function = std::move(function)]() mutable
                                {
                                    this->ThreadProc(std::move(function));
                                }};
                            });
    }

    // Procedure that will be executed by each thread, including the "main" thread and all background ones.
    template<typename Function>
    void ThreadProc(Function&& function)
    {
        // Increment `_num_threads_waiting_for_start_signal` while the mutex is locked, thus signaling that a new thread is ready to start.
        std::unique_lock lock{_mutex};
        ++_num_threads_waiting_for_start_signal;
        const bool ready_to_go = (_num_threads_waiting_for_start_signal == (1 + NumberOfBackgroundThreads));
        lock.unlock();

        if (ready_to_go)
        {
            // If this thread was the last one of the threads which must start simultaneously, notify all other threads that they are ready to start.
            _cv.notify_all();
        }
        else
        {
            // If this thread was not the last one of the threads which must start simultaneously, wait on `_cv` until all other threads are ready.
            lock.lock();
            _cv.wait(lock, [this]() noexcept -> bool
                     {
                         return (_num_threads_waiting_for_start_signal == (1 + NumberOfBackgroundThreads));
                     });
            lock.unlock();
        }

        // Execute this thread's internal function.
        std::forward<Function>(function)();
    }

private:
    ThisThreadFunction _this_thread_function;
    std::array<std::thread, NumberOfBackgroundThreads> _background_threads;
};

template<typename T, typename... U>
ConcurrentRunner(T&&, U&&...) -> ConcurrentRunner<std::decay_t<T>, sizeof...(U)>;

//---------------------------------------------------------------------------------------------------------------------------------------------------
// Example of usage:

#include <atomic>

int main()
{
    std::atomic<int> x{0};

    {
        ConcurrentRunner runner{[&]() { x += 1; }, [&]() { x += 10; }, [&]() { x += 100; }};
        runner.Run();
    }

    return (x.load() == 111) ? 0 : -1;
}
