Is there a performance benefit to using a thread pool over simply creating threads?

Question

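I would like to know whether using a thread pool has a performance benefit over simply creating threads and letting the operating system queue and schedule them.

Say I have 20 hardware threads available and 60 tasks to run on them, with something like this: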

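#include <algorithm>
#include <thread>
#include <vector>

void someTask() {
  // ... performs some task
}

int main() {
  // say std::thread::hardware_concurrency() == 20
  std::vector<std::thread> threads;
  for (int i = 0; i < 60; i++) {
    threads.push_back(std::thread(someTask));  // one OS thread per task
  }

  // wait for all 60 threads to finish
  std::for_each(threads.begin(), threads.end(), [](std::thread& x) { x.join(); });
}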

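Would it instead be better to create a pool of 20 threads and hand each thread another "task" whenever it becomes free? I assume spawning a thread has some overhead, but are there other benefits to creating a pool for a problem like this?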

Answer 1

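Creating a thread typically takes about 75k cycles (~20 us).

Starting that thread may take about 200k cycles (~60 us).

Waking a thread takes about 15k cycles (~5 us).

So you can see that it is worth creating the threads up front and waking them, rather than creating a new thread every time.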

#include <iostream>
#include <thread>
#include <cstdint>
#include <mutex>
#include <chrono>
#include <condition_variable>

// Read the CPU timestamp counter (GCC/Clang builtin, x86 only); all results are in cycles.
uint64_t now() {
    return __builtin_ia32_rdtsc();
}

uint64_t t0 = 0;
uint64_t t1 = 0;
uint64_t t2 = 0;
uint64_t t3 = 0;
uint64_t t4 = 0;
double sum01 = 0;
double sum02 = 0;
double sum34 = 0;
uint64_t count = 0;
std::mutex m;
std::condition_variable cv;

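// Body of the measured thread.
void run() {
    t1 = now();                          // the new thread has started running
    cv.notify_one();                     // nothing waits on cv at this point, so this has no effect
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk);                         // no predicate: relies on the 100 us sleep in create_thread()
    t4 = now();                          // the thread has been woken
}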

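void create_thread() {
    t0 = now();
    std::thread th( run );
    t2 = now();
    std::this_thread::sleep_for( std::chrono::microseconds(100)); // give the thread time to reach cv.wait
    t3 = now();
    cv.notify_one();
    th.join();
    count++;
    sum01 += (t1-t0);                    // from creation until the thread body starts running
    sum02 += (t2-t0);                    // time spent inside the std::thread constructor
    sum34 += (t4-t3);                    // from notify until the waiting thread resumes
}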

int main() {
    const uint32_t numloops = 10;
    for ( uint32_t j=0; j<numloops; ++j ) {
        create_thread();
    }
    std::cout << "t01:" << sum01/count << std::endl;
    std::cout << "t02:" << sum02/count << std::endl;
    std::cout << "t34:" << sum34/count << std::endl;
}

Typical results (in cycles):

Program returned: 0
t01:64614.4
t02:54655
t34:15758.4

Source: https://godbolt.org/z/recfjKe8x

Solution 1
0 2021-12-23 18:48:00
