
How long does it take for a thread waiting with pthread_cond_wait to wake after being signaled? How can I estimate this time?

I'm writing a C++ ThreadPool implementation and using pthread_cond_wait in my worker's main function. I was wondering how much time passes between signaling the condition variable and the thread or threads waiting on it waking up. Do you have any idea how I can estimate or calculate this time?

Thank you very much.

It depends on the cost of a context switch, which in turn depends on:

  1. the OS,
  2. the CPU,
  3. whether the switch is to a thread in the same process or to a different process,
  4. the load on the machine,
  5. whether the thread resumes on the same core it last ran on,
  6. the working set size,
  7. the time since the thread last ran.

Best case on Linux with an i7: about 1100 ns, for a thread in the same process, resuming on the same core it last ran on, having been the last thread to run there, with no load and a working set of 1 byte.

Bad case (flushed from cache, different core, different process): expect about 30 µs of CPU overhead.

Where does the cost go:

  1. Saving the last process context: 70-400 cycles.
  2. Loading the new context: 100-400 cycles.
  3. If it is a different process, flushing the TLB and reloading it with 3 to 5 page walks, each of which could potentially come from memory at ~300 cycles. Plus a few more page walks if more than one page is touched, counting both instructions and data.
  4. OS overhead: we all like the nice statistics, for example adding 1 to the context switch counter.
  5. Scheduling overhead: deciding which task to run next.
  6. Potential cache misses on the new core: ~12 cycles per cache line from the core's own L2 cache, and it goes downhill from there the farther away the data is and the more of it there is.
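As a rough worked example, the fixed costs in items 1 to 3 can be added up and converted to time. This is only a sketch: the function names are made up for illustration, the clock frequency is an assumption you pass in, and the TLB term takes the pessimistic case where every page walk goes to memory.

```cpp
#include <cassert>

// Cycle range (best case, worst case) for part of a context switch.
struct CycleRange {
    double lo;
    double hi;
};

// Adds up the per-step figures from the list above (items 1 to 3 only).
// Items 4 to 6 (OS bookkeeping, scheduling, cache misses) are workload
// dependent and are deliberately left out.
CycleRange switch_cycles(bool cross_process) {
    // Save old context (70-400) plus load new context (100-400).
    CycleRange total{70.0 + 100.0, 400.0 + 400.0};
    if (cross_process) {
        // TLB refill: 3 to 5 page walks at ~300 cycles each.
        total.lo += 3 * 300.0;
        total.hi += 5 * 300.0;
    }
    return total;
}

// Converts cycles to nanoseconds. ghz is an ASSUMED core clock; one cycle
// at f GHz lasts 1/f ns.
double cycles_to_ns(double cycles, double ghz) {
    assert(ghz > 0.0);
    return cycles / ghz;
}
```

Assuming a 3 GHz core, the same-process range works out to roughly 57 to 267 ns and the cross-process range to roughly 357 to 767 ns. Both are well under the 1100 ns best case quoted above, which suggests that items 4 to 6 make up a large share of the total.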

As mentioned, the time a condition variable takes to react depends on many factors. One option is to actually measure it: start a thread that waits on a condition variable. Another thread then takes a timestamp right before signaling the variable, and the waiting thread takes a timestamp the moment it wakes up. Simple as that. This gives you a rough approximation of the time it takes for the thread to notice the signaled condition.

#include <mutex>
#include <condition_variable>
#include <thread>
#include <chrono>
#include <stdio.h>

typedef std::chrono::time_point<std::chrono::high_resolution_clock> timep;

int main()
{
    std::mutex mx;
    std::condition_variable cv;
    timep t0, t1;
    bool done = false;

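    std::thread th([&]() {
        // Hold the lock whenever 'done' or 't1' is touched; cv.wait()
        // releases it while the thread is blocked.
        std::unique_lock lock(mx);
        while (!done)
        {
            cv.wait(lock);
            t1 = std::chrono::high_resolution_clock::now();
        }
    });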

    for (int i = 0; i < 25; ++i) // measure 25 times
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        t0 = std::chrono::high_resolution_clock::now();
        cv.notify_one();
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        std::unique_lock lock(mx);
        printf("test#%-2d: cv reaction time: %6.3f micro\n", i,
            1000000 * std::chrono::duration<double>(t1 - t0).count());
    }
    {
        std::unique_lock lock(mx);
        done = true;
    }
    cv.notify_one();
    th.join();
}

Try it on Coliru; it produced this output:

test#0 : cv reaction time: 50.488 micro
test#1 : cv reaction time: 55.057 micro
test#2 : cv reaction time: 53.765 micro
test#3 : cv reaction time: 50.973 micro
test#4 : cv reaction time: 51.015 micro
test#5 : cv reaction time: 57.166 micro
and so on...

On my Windows 11 laptop I got values roughly 5-10x faster (5-10 microseconds).
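Since the question is about pthread_cond_wait specifically, the same measurement can be sketched with the raw pthread API. This is a minimal sketch, not a rigorous benchmark: the Shared struct and the names worker, measure_wakeup_us and median are made up for illustration, the 1 ms pause is an arbitrary choice to let the worker block first, and a second condition variable is used so the signaling thread knows when the timestamp is ready.

```cpp
#include <pthread.h>

#include <algorithm>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

using Clock = std::chrono::steady_clock;

// State shared between the signaling thread and the waiting worker.
struct Shared {
    pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t wake = PTHREAD_COND_INITIALIZER;  // signaler -> worker
    pthread_cond_t ack = PTHREAD_COND_INITIALIZER;   // worker -> signaler
    int requested = 0;       // wakeups asked for so far
    int handled = 0;         // wakeups the worker has timestamped
    bool done = false;
    Clock::time_point woke;  // when the worker last resumed
};

void* worker(void* arg) {
    Shared* s = static_cast<Shared*>(arg);
    pthread_mutex_lock(&s->mx);
    for (;;) {
        // The predicate loop guards against spurious wakeups.
        while (!s->done && s->handled == s->requested)
            pthread_cond_wait(&s->wake, &s->mx);
        if (s->done) break;
        s->woke = Clock::now();
        s->handled = s->requested;
        pthread_cond_signal(&s->ack);
    }
    pthread_mutex_unlock(&s->mx);
    return nullptr;
}

// Returns one signal-to-wakeup latency per sample, in microseconds.
std::vector<double> measure_wakeup_us(int samples) {
    Shared s;
    pthread_t th;
    int rc = pthread_create(&th, nullptr, worker, &s);
    assert(rc == 0);
    (void)rc;

    std::vector<double> out;
    for (int i = 0; i < samples; ++i) {
        // Give the worker time to block in pthread_cond_wait.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));

        pthread_mutex_lock(&s.mx);
        ++s.requested;
        Clock::time_point t0 = Clock::now();  // taken before the worker can run
        pthread_mutex_unlock(&s.mx);
        pthread_cond_signal(&s.wake);

        // Wait until the worker has recorded its wakeup time.
        pthread_mutex_lock(&s.mx);
        while (s.handled != s.requested)
            pthread_cond_wait(&s.ack, &s.mx);
        out.push_back(
            std::chrono::duration<double, std::micro>(s.woke - t0).count());
        pthread_mutex_unlock(&s.mx);
    }

    // Tell the worker to exit, then wait for it.
    pthread_mutex_lock(&s.mx);
    s.done = true;
    pthread_mutex_unlock(&s.mx);
    pthread_cond_signal(&s.wake);
    pthread_join(th, nullptr);
    return out;
}

// Middle element after sorting (upper median for even sizes); less
// sensitive to scheduler outliers than the mean.
double median(std::vector<double> v) {
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    return v[v.size() / 2];
}
```

Calling median(measure_wakeup_us(1000)) from main gives a single representative figure for the machine it runs on. The measured interval includes the cost of the unlock and of the pthread_cond_signal call itself, so it is an upper bound on the pure wakeup latency. Build with -pthread where the toolchain requires it.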
