
Why is the main thread slower than the worker thread in pthread-win32?

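#include <math.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

// Runs the same loop as main() and reports its own elapsed time.
void* worker(void*)
{
    clock_t clk = clock();
    float val = 0;
    for(int i = 0; i != 100000000; ++i)
    {
        val += sin(i);
    }
    printf("val: %f\n", val);
    // Printing clock ticks as ms assumes CLOCKS_PER_SEC == 1000, as on Windows.
    printf("worker: %ld ms\n", (long)(clock() - clk));
    return 0;
}

int main()
{
    pthread_t tid;
    // Start the worker first so that both loops run concurrently.
    pthread_create(&tid, NULL, worker, NULL);
    clock_t clk = clock();
    float val = 0;
    for(int i = 0; i != 100000000; ++i)
    {
        val += sin(i);
    }
    printf("val: %f\n", val);
    printf("main: %ld ms\n", (long)(clock() - clk));
    pthread_join(tid, 0);
    return 0;
}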

The main thread and the worker thread are supposed to run equally fast, but the result is:

   val: 0.782206
   worker: 5017 ms
   val: 0.782206
   main: 8252 ms

The main thread is much slower, and I don't know why.


Problem solved. It's the compiler's problem: GCC (MinGW) behaves strangely on Windows. I compiled the code in Visual Studio 2012, and there is no speed difference.

"Main thread and the worker thread are supposed to run equally fast, but the result is:"

I have never seen a threading system outside a real-time OS that provides such guarantees. With Windows threads and every other threading system on desktop systems (I have also used POSIX threads, whatever the lightweight threading on Mac OS X is, and threads in C#), my understanding is that there are no performance guarantees about how fast one thread will run relative to another.

A possible explanation (speculation) is that, since you are using a modern quad-core CPU, it could be raising the clock rate on the main core. When the workload is mostly single-threaded, modern i5/i7/AMD FX systems raise the clock rate on one core to a pre-rated level whose heat the stock cooling can dissipate. On more parallel workloads, all the cores get a smaller bump in clock speed, again pre-rated based on heat dissipation, and when idle all of the cores are throttled down to minimize power usage. It is possible that the background work is mostly performed on a single core, and that the time the second thread spends on the second core is not enough to justify switching to the mode where all the cores are boosted.

I would try again with 4 threads and 10x the workload. If you have a tool that monitors CPU load and clock speeds, I would check that as well. From that information you can infer whether I am right or wrong.

Another option is to profile the program and see which part of the work is taking the time. It could be that the OS calls are taking more time than your workload.

You could also test your software on another machine with different performance characteristics, such as a steady clock speed or a single core. This would provide more information.

What could be happening is that the worker thread's execution is being interleaved with main's execution, so that some of the worker thread's execution time is being counted against main's time. You could try putting a sleep(10) (some time larger than the run time of the worker and of main) at the very beginning of the worker and running it again.
