Why is the execution of 2 threads slower than that of 1 thread in C?

I am using the pthread library to write a program that approximates the value of pi with the Leibniz formula. The threads work on a shared resource. My thread function looks like this:

void *Leibniz(void *vars)
{
    struct variables *val = (struct variables *)vars;
    int startLimit = val->start;
    int endLimit = val->end;
    
    int i;
    for (i = startLimit; i <= endLimit; i++)
    {
        pthread_mutex_lock(&mutex);
        sum += (pow(-1, i) / ((2 * i) + 1));
        pthread_mutex_unlock(&mutex);
    }

    return NULL; /* a void * thread function must return a value */
}

When I run the program with N iterations and 1 thread, I get the correct output in about 4.5 seconds on average. When I run the same program with two threads, it takes around 18 seconds. I have to use multithreading to make the program faster, but the exact opposite is happening. Can anyone explain why?

You use locks to ensure that sum += (pow(-1, i) / ((2 * i) + 1)); is calculated in exactly one thread at a time. Multithreading can be faster only when multiple threads do work at the same time. Here the entire loop body runs under the lock, so the two threads never compute in parallel.

Mutexes and thread creation are themselves costly, which is why the multithreaded but non-parallel program is slower than the single-threaded one.

What is your proposed solution?

Don't share resources. In this case, keep a separate sum for each thread, then add up the partial sums at the end. Divide and conquer.
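One way the "separate sum for each thread" idea could look is sketched below. Nothing here comes from the original post: struct partial_job, partial_worker and leibniz_pi_partial are hypothetical names, and the parity test is swapped in for pow(-1, i) (it yields the same sign). Each thread writes only to its own slot, so no mutex is needed; the main thread adds the slots after joining.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical work item: one per thread, never shared between threads. */
struct partial_job {
    long start;     /* first term index (inclusive) */
    long end;       /* last term index (inclusive) */
    double partial; /* written only by the owning thread */
};

static void *partial_worker(void *arg)
{
    struct partial_job *job = arg;
    double local = 0.0;

    for (long i = job->start; i <= job->end; i++) {
        /* Assumption: parity test used in place of pow(-1, i). */
        double sign = (i % 2 == 0) ? 1.0 : -1.0;
        local += sign / (2.0 * (double)i + 1.0);
    }
    job->partial = local; /* no lock: this slot belongs to this thread */
    return NULL;
}

/* Approximates pi from the first `terms` Leibniz terms using `nthreads`
 * threads. Returns -1.0 if allocation or thread creation fails. */
double leibniz_pi_partial(long terms, int nthreads)
{
    assert(terms > 0 && nthreads > 0);

    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    struct partial_job *jobs = malloc((size_t)nthreads * sizeof *jobs);
    if (tids == NULL || jobs == NULL) {
        free(tids);
        free(jobs);
        return -1.0;
    }

    long chunk = terms / nthreads;
    int started = 0;
    for (int t = 0; t < nthreads; t++) {
        jobs[t].start = t * chunk;
        /* The last thread also takes the remainder of the range. */
        jobs[t].end = (t == nthreads - 1) ? terms - 1 : (t + 1) * chunk - 1;
        jobs[t].partial = 0.0;
        if (pthread_create(&tids[t], NULL, partial_worker, &jobs[t]) != 0)
            break;
        started++;
    }

    /* Join first, then sum the sums in the main thread. */
    double sum = 0.0;
    for (int t = 0; t < started; t++) {
        pthread_join(tids[t], NULL);
        sum += jobs[t].partial;
    }

    free(tids);
    free(jobs);
    return (started == nthreads) ? 4.0 * sum : -1.0;
}
```

Calling leibniz_pi_partial(1000000, 2) from main and printing the result should give a value close to pi. Build with the pthread option of your compiler (for example -pthread with gcc).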

It looks like the code doesn't express what you intended.

Instead of locking on each loop iteration (which degrades performance because of lock contention and the context switches it can cause), what you probably wanted is to update the sum once, at the end of the calculation:

(Note: only one lock is needed, when updating the shared sum.)

void *Leibniz(void *vars)
{
    struct variables *val = (struct variables *)vars;
    int startLimit = val->start;
    int endLimit = val->end;

    // note: this part is calculated by each thread separately
    // no (expensive) locking here

    double local_sum = 0;
    for (int i = startLimit; i <= endLimit; i++)
    {
         local_sum += (pow(-1, i) / ((2 * i) + 1));
    }

    // now update the sum (which is shared among all threads)
    // so need some kind of synchronization here
    // -> only 1 lock instead of (endLimit - startLimit + 1) times locking
    pthread_mutex_lock(&mutex);
    sum += local_sum;
    pthread_mutex_unlock(&mutex);

    return NULL; // a void * thread function must return a value
}
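For completeness, here is a sketch of the code that could surround such a thread function. The original post does not show the definitions of struct variables, sum or mutex, so the ones below are assumptions, as are MAX_THREADS and the driver run_leibniz. The parity test again stands in for pow(-1, i), which also removes the need to link the math library.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 64 /* arbitrary cap chosen for this sketch */

/* Assumed definitions: the original post does not show them. */
struct variables {
    int start;
    int end;
};

static double sum; /* shared among all threads */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *Leibniz(void *vars)
{
    struct variables *val = vars;
    double local_sum = 0.0;

    /* Computed by each thread on its own, without locking. */
    for (int i = val->start; i <= val->end; i++) {
        double sign = (i % 2 == 0) ? 1.0 : -1.0; /* stands in for pow(-1, i) */
        local_sum += sign / (2.0 * i + 1.0);
    }

    /* One lock per thread instead of one per iteration. */
    pthread_mutex_lock(&mutex);
    sum += local_sum;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Hypothetical driver: splits `terms` Leibniz terms across `nthreads`
 * threads and returns the approximation of pi, or -1.0 if a thread
 * could not be created. Not reentrant, because `sum` is a global. */
double run_leibniz(int terms, int nthreads)
{
    assert(terms > 0 && nthreads > 0 && nthreads <= MAX_THREADS);

    pthread_t tids[MAX_THREADS];
    struct variables ranges[MAX_THREADS];
    int chunk = terms / nthreads;
    int started = 0;

    sum = 0.0; /* safe: no worker threads are running yet */
    for (int t = 0; t < nthreads; t++) {
        ranges[t].start = t * chunk;
        /* The last thread also takes the remainder of the range. */
        ranges[t].end = (t == nthreads - 1) ? terms - 1 : (t + 1) * chunk - 1;
        if (pthread_create(&tids[t], NULL, Leibniz, &ranges[t]) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tids[t], NULL);

    /* All started threads have been joined, so reading sum needs no lock. */
    return (started == nthreads) ? 4.0 * sum : -1.0;
}
```

Timing run_leibniz(N, 1) against run_leibniz(N, 2) for the same N is one way to check whether the second thread now helps; the result depends on the machine and on N.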


 