pthread: running threads concurrently in C

Question

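I need to implement the Leibniz algorithm in C using pthreads. I have the code below, but the threaded implementation currently takes the same time as the sequential one, so I don't think it is actually running concurrently. Can anyone spot the mistake?

Thanks!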

#include<stdio.h>
#include<math.h>
#include<pthread.h>
#include<stdlib.h>
#define NUM_THREADS 2
#define ITERATIONS 100000000
double result = 0.0;
void *leibniz(void *threadid){
  int size = ITERATIONS/NUM_THREADS;
  int start = (long)threadid * size;
  int end = ((long)threadid+1) * size;
  int i;
  for(i = start; i<end; i++){
    int denom = 2*i+1;
    result += pow(-1.0, i) * (1.0/denom);
  }
}

int main(){
  pthread_t threads[NUM_THREADS];
  long t;
  int rc;

  // CREATE
  for(t=0;t<NUM_THREADS;t++){
    rc = pthread_create(&threads[t], NULL, leibniz, (void *)t);
    if(rc){
      printf("ERROR: return code %d\n", rc);
    }
  }
  // JOIN
  for(t=0;t<NUM_THREADS;t++){
    rc = pthread_join(threads[t], NULL);
    if(rc){
      printf("ERROR: return code %d\n", rc);
      exit(-1);
    }
  }
  printf("Pi %f\n", result*4);
  exit(0);

}

Thanks to Jean-François Fabre, I made these changes and now it works!

double result=0.0;

void *leibniz(void *threadid){
  double local = 0.0;
  int size = ITERATIONS/NUM_THREADS;
  int start = (long)threadid * size;
  int end = ((long)threadid+1) * size;
  int i;
  for(i = start; i<end; i++){
    local += (i%2==0 ? 1 : -1) * (1.0/(2*i+1));
  }
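  result += local*4;  /* note: this update of the shared total is not synchronized */
  return NULL;        /* a void * thread function must return a value */
}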

Answer 1

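I'll try to answer.

Even if your application is multithreaded, there is no guarantee that each core has its own FPU. I don't know much about this, but I believe some AMD processors actually share an FPU between cores.

Since your loop is basically additions and pow, it is 99% FPU computation, so if the FPU is shared on your machine, that would explain the bottleneck.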

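You can reduce FPU usage by not calling pow to compute -1 or 1; choosing the sign becomes a scalar operation, and that might make a difference. Just use -1 if i is odd and 1 otherwise, or keep an external 1/-1 variable and negate it on each iteration.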
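As a minimal sketch of the second suggestion (the sign variable negated on each iteration), assuming a hypothetical helper named leibniz_partial that is not part of the original code:

```c
#include <assert.h>

/* Hypothetical helper: sums terms [start, end) of the Leibniz series
 * 1 - 1/3 + 1/5 - ... without calling pow(). */
double leibniz_partial(long start, long end)
{
    assert(start >= 0 && start <= end);

    /* Even-indexed terms are positive, odd-indexed terms are negative. */
    double sign = (start % 2 == 0) ? 1.0 : -1.0;
    double sum = 0.0;

    for (long i = start; i < end; i++) {
        sum += sign / (double)(2 * i + 1);
        sign = -sign;   /* flip the sign instead of computing pow(-1.0, i) */
    }
    return sum;
}
```

Multiplying the result of leibniz_partial(0, n) by 4 gives the approximation of pi for n terms. Each thread could call it on its own [start, end) range, since the starting sign is derived from start.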

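Also, to avoid the race condition, accumulate the result in a local variable and add it to the shared total once at the end (ideally protecting that final addition with a mutex):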

double result = 0.0;
void *leibniz(void *threadid){
  double local = 0.0;
  int size = ITERATIONS/NUM_THREADS;
  int start = (long)threadid * size;
  int end = ((long)threadid+1) * size;
  int i;
  for(i = start; i<end; i++){
    int denom = 2*i+1;
    // using a ternary/scalar speeds up the "pow" computation, multithread or not
    local += (i%2 ? -1 : 1) * (1.0/denom);
  }
  // you may want to protect that addition with a pthread_mutex
  // start of critical section
  result += local;
  // end of critical section
  return NULL;  // a void * thread function must return a value
}
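The critical section marked in the comments above could be protected roughly as follows. This is only a sketch: the names slice, total_lock, worker, leibniz_pi and MAX_THREADS are made up for illustration, and passing a struct per thread is one possible design rather than what the original code does.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 64   /* hypothetical upper bound for this sketch */

struct slice { long start, end; };   /* hypothetical per-thread work range */

static double total = 0.0;
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    const struct slice *s = arg;
    double local = 0.0;

    /* Accumulate privately so the hot loop never touches shared data. */
    for (long i = s->start; i < s->end; i++)
        local += (i % 2 ? -1.0 : 1.0) / (double)(2 * i + 1);

    /* Critical section: one locked addition per thread. */
    pthread_mutex_lock(&total_lock);
    total += local;
    pthread_mutex_unlock(&total_lock);
    return NULL;
}

/* Approximates pi with `iterations` Leibniz terms split over `nthreads`. */
double leibniz_pi(int nthreads, long iterations)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    pthread_t tid[MAX_THREADS];
    struct slice part[MAX_THREADS];
    long size = iterations / nthreads;

    total = 0.0;
    for (int t = 0; t < nthreads; t++) {
        part[t].start = t * size;
        /* The last thread also takes any remainder of the division. */
        part[t].end = (t == nthreads - 1) ? iterations : (t + 1) * size;
        int rc = pthread_create(&tid[t], NULL, worker, &part[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; t++)
        pthread_join(tid[t], NULL);

    return 4.0 * total;
}
```

Because each thread takes the lock only once, after its loop has finished, the mutex should add very little overhead compared with locking inside the loop.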

http://wccftech.com/amd-one-fpu-per-core-design-zen-processors/

Answer 2

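I'm running Visual Studio on Windows and don't have pthreads installed, so I wrote a test program using Windows threads. I split the computation into one function that sums all the positive terms and one that sums all the negative terms. Double precision is not a problem, since the positive sum is < 22 and the negative sum is > -19.

The processor is an Intel 3770K at 3.5 GHz (each core has its own FPU). I compared calling the two functions back to back against running the second function in a separate thread. The two-thread case is twice as fast as the single-thread case: single thread ~0.360 seconds, two threads ~0.180 seconds.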

#include <stdio.h>
#include <time.h>
#include <windows.h>

static HANDLE ht1;                      /* thread handle */

static DWORD WINAPI Thread0(LPVOID);    /* thread functions */
static DWORD WINAPI Thread1(LPVOID);

static clock_t ctTimeStart;             /* clock values */
static clock_t ctTimeStop;
static double  dTime;

static double pip;              /* sum of positive terms */
static double pim;              /* sum of negative terms */
static double pi;               /* pi */

int main()
{
    ctTimeStart = clock();
    Thread0(NULL);
    Thread1(NULL);
    ctTimeStop = clock();
    dTime = (double)(ctTimeStop - ctTimeStart) / (double)(CLOCKS_PER_SEC);  
    pip *= 4.;          /* pip <  22 after *= 4. */
    pim *= 4.;          /* pim > -19 after *= 4. */
    pi = pip + pim;
    printf("%.16lf %.16lf %.16lf %2.5lf secs\n", pi, pip, pim, dTime);

    ctTimeStart = clock();
    ht1 = CreateThread(NULL, 0, Thread1, 0, 0, 0);
    Thread0(NULL);
    WaitForSingleObject(ht1, INFINITE); // wait for thread 1
    ctTimeStop = clock();
    dTime = (double)(ctTimeStop - ctTimeStart) / (double)(CLOCKS_PER_SEC);  
    pip *= 4.;          /* pip <  22 after *= 4. */
    pim *= 4.;          /* pim > -19 after *= 4. */
    pi = pip + pim;
    printf("%.16lf %.16lf %.16lf %2.5lf secs\n", pi, pip, pim, dTime);

    CloseHandle(ht1);

    return 0;
}

DWORD WINAPI Thread0(LPVOID lpVoid)
{
    double pp = 0.;                     /* local sum */
    int j;
    for(j = 200000001; j >= 0; j -= 4)
        pp += 1. / (double)(j);
    pip = pp;                           /* store sum */
    return 0;
}

DWORD WINAPI Thread1(LPVOID lpVoid)
{
    double pm = 0.;                     /* local sum */
    int j;
    for(j = 200000003; j >= 0; j -= 4)
        pm -= 1. / (double)(j);
    pim = pm;                           /* store sum */
    return 0;
}
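One caveat if this measurement is repeated with pthreads: the program above times itself with clock(), which in Microsoft's C runtime reports elapsed wall-clock time, whereas on POSIX systems clock() reports CPU time consumed by the whole process, summed over all threads. Timed that way, a correctly parallel program can appear to take as long as the sequential one. A minimal sketch of a wall-clock timer for POSIX, assuming a hypothetical helper named wall_seconds:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: seconds on a monotonic wall clock.
 * Differences between two calls give elapsed real time,
 * regardless of how many threads were running in between. */
double wall_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

Calling wall_seconds() once before the first pthread_create and once after the last pthread_join, then subtracting the two values, gives the elapsed time for the threaded section.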

Answer 1 (score 3, accepted): 2017-03-09 21:26:52

Answer 2 (score 0): 2017-03-09 23:19:37
