pthread_join is a bottleneck

Question

I have an application where pthread_join appears to be the bottleneck. I need help resolving this problem.

void *calc_corr(void *t) {
         begin = clock();
         // do work
         end = clock();
         duration = (double) (1000*((double)end - (double)begin)/CLOCKS_PER_SEC);
         cout << "Time is "<<duration<<"\t"<<h<<endl;
         pthread_exit(NULL);
}

int main() {
         start_t = clock();

         for (ii=0; ii<16; ii++) 
            pthread_create(&threads.p[ii], NULL, &calc_corr, (void *)ii);

         for (i=0; i<16; i++) 
            pthread_join(threads.p[15-i], NULL);

         stop_t = clock();

         duration2 = (double) (1000*((double)stop_t - (double)start_t)/CLOCKS_PER_SEC);
         cout << "\n Time is "<<duration2<<"\t"<<endl;

         return 0;
}

The time printed in the thread function is in the range of 40ms - 60ms, whereas the time printed in the main function is in the range of 650ms - 670ms. The irony is that my serial code also runs in 650ms - 670ms. What can I do to reduce the time taken by pthread_join?

Thanks in advance!

Answer 1

On Linux, clock() measures the combined CPU time of the process, that is, the CPU time summed over all of its threads. It does not measure the wall time.

This explains why you get ~640 ms = 16 * 40ms (as pointed out in the comments): each of the 16 threads uses about 40 ms of CPU time, and the clock() calls in main see the total.

To measure wall time, you should be using something like:

Answer 2

By creating threads you add overhead to your system: creation time and scheduling time. Creating a thread requires allocating its stack, etc.; scheduling means more context switching. Also, pthread_join "suspends execution of the calling thread until the target thread terminates". This means you wait for thread 1 to finish; when it does, you are rescheduled as quickly as possible but not instantly; then you wait for thread 2, and so on.

Now suppose your computer has only a few cores, say one or two, and you are creating 16 threads. At best, 2 threads of your program will run at the same time, and just by adding up their clock measurements you get something around 400 ms.

Again, it depends on a lot of things, so I have only quickly skimmed over what is happening.

Answer 1
10, accepted, 2012-01-31 22:32:11

Answer 2
1, 2012-01-31 22:50:38
