Pthread is slower than serial version
I have a function that rearranges some data in a table of N particles, and I am trying to convert it to a parallel version. The serial version is the following:
#include "stdio.h"
#include "stdlib.h"
#include "string.h"

#define DIM 3

void data_rearrangement(float *Y, float *X, unsigned int *permutation_vector, int N){
    for(int i=0; i<N; i++){
        memcpy(&Y[i*DIM], &X[permutation_vector[i]*DIM], DIM*sizeof(float));
    }
}
The parallel version that I have made is the following:
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include "utils.h"
#include <pthread.h>

#define DIM 3

struct data{
    float *Y;
    float *X;
    unsigned int *permutation_vector;
    int N;
};

void *threaded_data_rearrangement(void *args){
    struct data *new_data;
    new_data = (struct data *) args;
    for(int i=0; i<new_data->N/NUM_THREADS; i++){
        memcpy(&new_data->Y[i*DIM], &new_data->X[new_data->permutation_vector[i]*DIM], DIM*sizeof(float));
    }
}

void data_rearrangement(float *Y, float *X, unsigned int *permutation_vector, int N){
    void *status;
    int rc;
    struct data data_array[NUM_THREADS];
    pthread_t threads[NUM_THREADS];
    int k;
    for (k=0; k<NUM_THREADS; k++){
        data_array[k].Y = Y;
        data_array[k].X = X;
        data_array[k].permutation_vector = permutation_vector;
        data_array[k].N = N;
        pthread_create(&threads[k], NULL, threaded_data_rearrangement, (void *)&data_array[k]);
    }
    for(k=0; k<NUM_THREADS; k++){
        rc = pthread_join(threads[k], &status);
        if (rc) {
            printf("ERROR; return code from pthread_join() is %d\n",rc);
            exit(-1);
        }
    }
}
Can someone explain to me why the parallel version is slower?
*Note that NUM_THREADS has been declared as a global variable in my whole project.
I can't think of any other way to make this function parallel. Thank you in advance :)
I am not sure what you are trying to accomplish (in terms of timing), since you did not include a main function. Here is a working main function:
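#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NUM_THREADS 8
#define DIM 3

// the threaded function from the question
void data_rearrangement(float *Y, float *X, unsigned int *permutation_vector, int N);
// the serial function from the question, renamed here to avoid a name clash
void serial_data_rearrangement(float *Y, float *X, unsigned int *permutation_vector, int N);

int main(void) {
    // number of particles (change accordingly)
    int sz = 10000000;
    unsigned int *pv;
    float *X, *Y;
    float a;
    int i;

    // allocate the arrays: each particle has DIM floats
    X = malloc((size_t)sz * DIM * sizeof(float));
    Y = malloc((size_t)sz * DIM * sizeof(float));
    pv = malloc((size_t)sz * sizeof(unsigned int));
    if (X == NULL || Y == NULL || pv == NULL) {
        printf("ERROR; out of memory\n");
        return 1;
    }

    // fill with random numbers
    for (i = 0; i < sz * DIM; i++) {
        a = (float)(rand() % 100 + 1);  // +1 avoids dividing by zero below
        X[i] = (float)rand() / ((float)RAND_MAX / a);
        Y[i] = (float)rand() / ((float)RAND_MAX / a);
    }
    for (i = 0; i < sz; i++) {
        pv[i] = (unsigned int)(rand() % 100);  // always a valid particle index
    }

    // measure parallel version execution time
    clock_t t;
    t = clock();
    printf("Timer starts \n");
    data_rearrangement(X, Y, pv, sz);
    t = clock() - t;
    double time_taken = ((double)t) / CLOCKS_PER_SEC;  // calculate the elapsed time
    printf("Parallel took %f seconds to execute \n", time_taken);

    // measure serial version execution time
    clock_t t2;
    t2 = clock();
    printf("Timer starts \n");
    serial_data_rearrangement(X, Y, pv, sz);
    t2 = clock() - t2;
    time_taken = ((double)t2) / CLOCKS_PER_SEC;  // calculate the elapsed time
    printf("Serial took %f seconds to execute \n", time_taken);

    free(X);
    free(Y);
    free(pv);
    return 0;
}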
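Two things are worth checking before comparing the timings. First, every thread in threaded_data_rearrangement loops over i = 0 .. N/NUM_THREADS - 1, so all threads copy the same first chunk and the rest of Y is never written. Second, clock() returns the processor time used by the whole process, which on Linux and other POSIX systems adds up the time of all threads, so a threaded run can look as slow as or slower than the serial one even when it finishes sooner. The following is only a minimal sketch of one way to address both, not a definitive fix. The names struct slice, rearrange_slice, parallel_rearrangement, wall_seconds and MAX_THREADS are made up for this example, and timespec_get assumes a C11 library (clock_gettime with CLOCK_MONOTONIC is the usual POSIX alternative).

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <time.h>

#define DIM 3
#define MAX_THREADS 64  /* hypothetical upper bound for this sketch */

/* Hypothetical per-thread work item: the thread owns rows [begin, end). */
struct slice {
    float *Y;
    const float *X;
    const unsigned int *perm;
    int begin;
    int end;
};

/* Thread body: copy only the rows that belong to this slice. */
static void *rearrange_slice(void *arg) {
    const struct slice *s = arg;
    for (int i = s->begin; i < s->end; i++) {
        memcpy(&s->Y[(size_t)i * DIM],
               &s->X[(size_t)s->perm[i] * DIM],
               DIM * sizeof(float));
    }
    return NULL;
}

/* Split N rows into num_threads disjoint ranges, one per thread.
   Returns 0 on success, or the error code from pthread_create/pthread_join. */
int parallel_rearrangement(float *Y, const float *X, const unsigned int *perm,
                           int N, int num_threads) {
    assert(num_threads > 0 && num_threads <= MAX_THREADS);
    pthread_t threads[MAX_THREADS];
    struct slice slices[MAX_THREADS];
    int chunk = N / num_threads;
    int rem = N % num_threads;  /* the first rem threads take one extra row */
    int begin = 0;
    int result = 0;
    int created = 0;

    for (int k = 0; k < num_threads; k++) {
        int len = chunk + (k < rem ? 1 : 0);
        slices[k] = (struct slice){Y, X, perm, begin, begin + len};
        begin += len;
        result = pthread_create(&threads[k], NULL, rearrange_slice, &slices[k]);
        if (result != 0) {
            break;  /* stop creating; still join the threads already started */
        }
        created++;
    }
    for (int k = 0; k < created; k++) {
        int rc = pthread_join(threads[k], NULL);
        if (rc != 0 && result == 0) {
            result = rc;
        }
    }
    return result;
}

/* Wall-clock time in seconds; unlike clock(), it does not sum CPU time
   over threads. */
double wall_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
```

To time a call, read wall_seconds() before and after parallel_rearrangement(Y, X, pv, N, NUM_THREADS) and subtract. Even with disjoint slices the speedup may be modest: each memcpy moves only 12 bytes and the loop is largely limited by memory bandwidth, so for small N the cost of creating and joining threads can dominate.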