'''The original post has been edited'''
How can I make a thread pool for two nested for loops in C++? I need to run the start_thread function 22 times for each number between 0 and 6, and the number of threads available will vary with the machine I am using. How can I create a pool that hands each free thread the next iteration of the nested loops?
for (int t=0; t <22; t++){
    for(int p=0; p<6; p++){
        thread th1(start_thread, p);
        thread th2(start_thread, p);
        th1.join();
        th2.join();
    }
}
Not really certain about what you want, but maybe it's something like this.
for (int t=0; t <22; t++){
    std::vector<std::thread> th;
    // start six threads, one per value of p
    for(int p=0; p<6; p++){
        th.emplace_back(start_thread, p);
    }
    // wait for all six before the next value of t
    for(int p=0; p<6; p++){
        th[p].join();
    }
}
(or maybe permute the two loops)
Edit: if you want to control the number of threads:
#include <iostream>
#include <thread>
#include <vector>

void
start_thread(int t, int p)
{
    std::cout << "th " << t << ' ' << p << '\n';
}

void
join_all(std::vector<std::thread> &th)
{
    for(auto &e: th)
    {
        e.join();
    }
    th.clear();
}

int
main()
{
    std::size_t max_threads=std::thread::hardware_concurrency();
    if(max_threads==0)
    {
        max_threads=1; // hardware_concurrency() may return 0 when unknown
    }
    std::vector<std::thread> th;
    for(int t=0; t <22; ++t)
    {
        for(int p=0; p<6; ++p)
        {
            th.emplace_back(start_thread, t, p);
            if(th.size()==max_threads)
            {
                join_all(th); // wait for the whole batch before starting more
            }
        }
    }
    join_all(th); // join the last, possibly partial, batch
    return 0;
}
If you don't want dependency on a third-party library, this is pretty simple.
Just create a number of threads you like and let them pick a "job" from some queue.
For example:
#include <iostream>
#include <mutex>
#include <chrono>
#include <vector>
#include <thread>
#include <queue>

void work(int p)
{
    // do the "work"
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    std::cout << p << std::endl;
}

std::mutex m;
std::queue<int> jobs;

void worker()
{
    while (true)
    {
        int job(0);
        // sync access to the jobs queue
        {
            std::lock_guard<std::mutex> l(m);
            if (jobs.empty())
                return;
            job = jobs.front();
            jobs.pop();
        }
        work(job);
    }
}

int main()
{
    // queue all jobs
    for (int t = 0; t < 22; t++) {
        for (int p = 0; p < 6; p++) {
            jobs.push(p);
        }
    }
    // create a reasonable number of threads;
    // hardware_concurrency() may return 0, so use at least one
    const unsigned hw = std::thread::hardware_concurrency();
    const int n = hw ? static_cast<int>(hw) : 1;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back(worker);
    // wait for all of them to finish
    for (int i = 0; i < n; ++i)
        threads[i].join();
}
[ADDED] Obviously, you don't want global variables in your production code; this is simply a demo solution.
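One possible way to drop the globals is to keep the queue and the mutex as locals and let the workers capture them by reference. This is only a sketch: run_jobs and run_demo are hypothetical names, and the actual work is replaced by a counter so the result can be checked.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: processes every job in `jobs` on up to `n_threads`
// workers and returns how many jobs were processed.
int run_jobs(std::queue<int> jobs, unsigned n_threads)
{
    std::mutex m;              // protects `jobs`; a local, not a global
    std::atomic<int> done{0};  // counts finished jobs

    auto worker = [&] {
        while (true)
        {
            int job;
            {
                std::lock_guard<std::mutex> l(m);
                if (jobs.empty())
                    return;
                job = jobs.front();
                jobs.pop();
            }
            (void)job;  // stand-in for work(job)
            ++done;
        }
    };

    // callers may pass 0 (hardware_concurrency() can return it): use at least one
    n_threads = std::max(1u, n_threads);
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < n_threads; ++i)
        threads.emplace_back(worker);
    for (auto& t : threads)
        t.join();

    assert(jobs.empty());  // every job was taken by some worker
    return done.load();
}

// Builds the 22 x 6 job list from the question and runs it.
int run_demo()
{
    std::queue<int> jobs;
    for (int t = 0; t < 22; ++t)
        for (int p = 0; p < 6; ++p)
            jobs.push(p);
    return run_jobs(std::move(jobs), std::thread::hardware_concurrency());
}
```

Calling run_demo() from main should report 132 processed jobs; the locals outlive the workers because run_jobs joins every thread before returning.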
Stop trying to code, and first draw out what you need to do and the pieces you need to have in order to do it.
You need one queue to hold the jobs, one mutex to protect the queue so the threads don't smurf it up with simultaneous accesses, and N threads.
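Each thread function is a loop that
1. acquires the mutex,
2. gets a job from the queue,
3. releases the mutex, and
4. processes the job.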
In this case I'd keep things simple by exiting the loop and the thread when there are no more jobs in the queue in step 2. In production you'd have the thread block and wait on the queue so it's still available to service jobs added later.
Wrap that up in a class with a function that allows you to add jobs to the queue, a function to start N threads, and a function to join on all of the running threads.
main defines an instance of the class, feeds in the jobs, starts the thread pool and then blocks on join until everyone's done.
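A rough sketch of such a class could look like the following. The name JobPool, its member functions and the use of std::function<void()> as the job type are all assumptions made for illustration; workers exit as soon as the queue is empty, as described above.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class: one queue, one mutex, N threads.
class JobPool
{
public:
    ~JobPool() { join(); }  // never destroy a joinable std::thread

    // Add a job; in this simple sketch, add all jobs before start().
    void add_job(std::function<void()> job)
    {
        std::lock_guard<std::mutex> l(m_);
        jobs_.push(std::move(job));
    }

    // Start n worker threads; each one exits once the queue is empty.
    void start(unsigned n)
    {
        assert(n > 0);
        for (unsigned i = 0; i < n; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Block until every running worker has finished.
    void join()
    {
        for (auto& t : threads_)
            t.join();
        threads_.clear();
    }

private:
    void run()
    {
        while (true)
        {
            std::function<void()> job;
            {
                std::lock_guard<std::mutex> l(m_);  // hold the mutex only while touching the queue
                if (jobs_.empty())
                    return;                         // no more jobs: leave the thread
                job = std::move(jobs_.front());
                jobs_.pop();
            }                                       // mutex released here
            job();                                  // process the job outside the lock
        }
    }

    std::mutex m_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> threads_;
};

// Feeds in the 22 x 6 jobs, starts the pool, and blocks on join.
int pool_demo()
{
    std::atomic<int> calls{0};
    JobPool pool;
    for (int t = 0; t < 22; ++t)
        for (int p = 0; p < 6; ++p)
            pool.add_job([&calls] { ++calls; });
    pool.start(std::max(1u, std::thread::hardware_concurrency()));
    pool.join();
    return calls.load();
}
```

A main that calls pool_demo() follows the outline given here: define the instance, feed in the jobs, start the threads, then block on join.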
Once you've beaten the design into something you have high confidence does what you need it to do, then you start writing code. Write code, especially multi-threaded code, without a plan and you're in for a lot of debugging and re-writing that usually exceeds the time spent on design by a significant margin.
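For the production-style variant mentioned earlier, where idle threads block and wait on the queue instead of exiting, one common approach is a std::condition_variable. The sketch below is an assumption about how that might look, not part of the design above; BlockingPool and blocking_demo are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class: workers sleep on the queue and serve jobs added later.
class BlockingPool
{
public:
    explicit BlockingPool(unsigned n)
    {
        assert(n > 0);
        for (unsigned i = 0; i < n; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Lets the workers finish the queued jobs, then stops and joins them.
    ~BlockingPool()
    {
        {
            std::lock_guard<std::mutex> l(m_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_)
            t.join();
    }

    void add_job(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> l(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();  // wake one sleeping worker
    }

private:
    void run()
    {
        while (true)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> l(m_);
                // sleep until there is a job or the pool is shutting down
                cv_.wait(l, [this] { return stop_ || !jobs_.empty(); });
                if (jobs_.empty())
                    return;  // only reachable once stop_ is set
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run the job outside the lock
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stop_ = false;
    std::vector<std::thread> threads_;
};

// Runs the 22 x 6 jobs on four workers and returns how many ran.
int blocking_demo()
{
    std::atomic<int> calls{0};
    {
        BlockingPool pool(4);
        for (int t = 0; t < 22; ++t)
            for (int p = 0; p < 6; ++p)
                pool.add_job([&calls] { ++calls; });
    }  // the destructor drains the queue and joins the workers
    return calls.load();
}
```

Here the workers start in the constructor and stay alive between jobs, so jobs can be added at any time; the price is the extra shutdown flag and the wake-up logic.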
Since C++17 you can use one of the execution policies for many of the algorithms in the standard library. This can greatly simplify going over a number of work packages. What goes on behind the curtains is usually that it picks threads from a built-in thread pool and distributes work to them efficiently. It usually uses just enough™ threads on both Linux and Windows, and it'll use all the CPU you've got left (0% idle on all cores once the CPUs have started spinning at max frequency), strangely without making either Linux or Windows "sluggish".
Here I've used the execution policy std::execution::parallel_policy (indicated by the std::execution::par constant). If you can prepare the work that needs to be done and put it in a container, like a std::vector, it'll be really easy.
#include <algorithm>
#include <chrono>
#include <execution> // std::execution::par
#include <iostream>
// #include <thread> // not needed to run with execution policies
#include <vector>

struct work_package {
    work_package() : payload(co) { ++co; }
    int payload;
    static int co;
};
int work_package::co = 10;

int main() {
    std::vector<work_package> wps(22*6); // 132 work packages
    for(const auto& wp : wps) std::cout << wp.payload << '\n'; // prints 10 to 141
    // work on the work packages
    std::for_each(std::execution::par, wps.begin(), wps.end(), [](auto& wp) {
        // Probably in a thread - As long as you do not write to the same work package
        // from different threads, you don't need synchronization here.
        // do some work with the work package
        ++wp.payload;
    });
    for(const auto& wp : wps) std::cout << wp.payload << '\n'; // prints 11 to 142
}
With g++ you may need to install tbb (the Threading Building Blocks library), which you also need to link with: -ltbb.
apt install libtbb-dev on Ubuntu.
dnf install tbb-devel.x86_64 on Fedora.
Other distributions may call it something different.
Visual Studio (2017 and later) links with the proper library automatically (also tbb, if I'm not mistaken).