Why is a single thread faster than multiple threads in Ubuntu using pthread?
XUbuntu 14.04, 2 processors.
The multi-threaded version takes 0.8s, while the single-threaded version takes only 0.4s.
If MULTI_THREAD is defined, the program runs the producer and the consumer in two threads. Otherwise, it runs them one after the other in a single thread.
What's wrong?
----------------code------------------------------
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MULTI_THREAD
#define NUM 10000
#define SEM_M 10

int arr[NUM];
FILE *f;

typedef struct _SemData{
    sem_t sem_full;   /* counts items ready to be consumed */
    sem_t sem_empty;  /* counts free slots, at most SEM_M */
}SemData;

void InitSemData(SemData *sd){
    sem_init(&sd->sem_full,0,0);
    sem_init(&sd->sem_empty,0,SEM_M);
}

void DestroySemData(SemData *sd){
    sem_destroy(&sd->sem_full);
    sem_destroy(&sd->sem_empty);
}

void *Produce(void* data){
#ifdef MULTI_THREAD
    SemData* psd=(SemData*)data;
#endif
    int i;
    for(i=0;i<NUM;++i){
#ifdef MULTI_THREAD
        sem_wait(&psd->sem_empty);
#endif
        arr[i]=i;
        fprintf(f,"produce:%d\n",arr[i]);
#ifdef MULTI_THREAD
        sem_post(&psd->sem_full);
#endif
    }
    return NULL;
}

void *Custom(void* data){
#ifdef MULTI_THREAD
    SemData* psd=(SemData*)data;
#endif
    int i,j;
    for(i=0;i<NUM;++i){
#ifdef MULTI_THREAD
        sem_wait(&psd->sem_full);
#endif
        int tmp=0;
        for(j=0;j<NUM;++j){
            tmp+=arr[i];
        }
        arr[i]=tmp;
        fprintf(f,"Custom:%d\n",arr[i]);
#ifdef MULTI_THREAD
        sem_post(&psd->sem_empty);
#endif
    }
    return NULL;
}

int main(void){
    f=fopen("b.txt","w");
    if(f==NULL){
        perror("fopen");
        return 1;
    }
    /* clock() reports CPU time used by the whole process, all threads included */
    clock_t start=clock();
#ifdef MULTI_THREAD
    SemData sd;
    InitSemData(&sd);
    pthread_t th0,th1;
    pthread_create(&th0,NULL,Produce,(void*)&sd);
    pthread_create(&th1,NULL,Custom,(void*)&sd);
    pthread_join(th0,NULL);
    pthread_join(th1,NULL);
    DestroySemData(&sd);
#else
    Produce(NULL);
    Custom(NULL);
#endif
    printf("TotalTime:%fs\n",((float)(clock()-start))/CLOCKS_PER_SEC);
    fclose(f);
    return 0;
}
In general, parallelization brings additional costs. The threads have to communicate to distribute and collect data. Additionally, synchronization can be very expensive.
Your single-threaded code works like this:
Produce all the numbers
Consume all the numbers
The multi-threaded code works like this:
Producer                            Consumer
--------                            --------
Produce one number                  Wait for a number to be produced
Wait for a number to be consumed    Consume one number
Produce one number                  Wait for a number to be produced
Wait for a number to be consumed    Consume one number
Produce one number                  Wait for a number to be produced
Wait for a number to be consumed    Consume one number
...
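As you can see, the two threads spend most of their time waiting for each other. (With SEM_M set to 10 the producer can run up to 10 numbers ahead, but once that window is full it advances only as fast as the consumer.)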
If there were no overhead to signalling and context switching, this would take approximately the same time as the single-threaded code, but since semaphore operations and context switches are pretty expensive, it's much slower.
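One common way to reduce the signalling cost described above is to signal once per batch of numbers instead of once per number. The following is only a sketch under assumptions: `BatchQueue`, `produce_batches`, `consume_batches` and `run_batched` are hypothetical names that do not appear in the question, the consumer just sums the values, and the array holds every item so no "empty" semaphore is needed.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical shared state for a batched producer/consumer pair. */
typedef struct {
    int *items;             /* one slot per item, never reused */
    int count;              /* total number of items */
    int batch;              /* items handed over per semaphore post */
    sem_t ready;            /* counts batches ready to be consumed */
    long long consumed_sum; /* written by the consumer only */
} BatchQueue;

static void *produce_batches(void *arg) {
    BatchQueue *q = arg;
    for (int i = 0; i < q->count; ++i) {
        q->items[i] = i;
        /* Post once per full batch, and once for a trailing partial batch. */
        if ((i + 1) % q->batch == 0 || i + 1 == q->count)
            sem_post(&q->ready);
    }
    return NULL;
}

static void *consume_batches(void *arg) {
    BatchQueue *q = arg;
    long long sum = 0;
    for (int start = 0; start < q->count; start += q->batch) {
        /* Retry if the wait is interrupted by a signal handler. */
        while (sem_wait(&q->ready) == -1 && errno == EINTR)
            continue;
        int end = start + q->batch;
        if (end > q->count)
            end = q->count;
        for (int i = start; i < end; ++i)
            sum += q->items[i];
    }
    q->consumed_sum = sum;
    return NULL;
}

/* Runs one producer and one consumer; returns the consumed sum, or -1 on error. */
long long run_batched(int count, int batch) {
    BatchQueue q;
    pthread_t prod, cons;
    if (count < 0 || batch < 1)
        return -1;
    q.items = malloc(sizeof *q.items * (size_t)(count > 0 ? count : 1));
    if (q.items == NULL)
        return -1;
    q.count = count;
    q.batch = batch;
    q.consumed_sum = 0;
    if (sem_init(&q.ready, 0, 0) != 0) {
        free(q.items);
        return -1;
    }
    if (pthread_create(&prod, NULL, produce_batches, &q) != 0) {
        sem_destroy(&q.ready);
        free(q.items);
        return -1;
    }
    if (pthread_create(&cons, NULL, consume_batches, &q) != 0) {
        pthread_join(prod, NULL);
        sem_destroy(&q.ready);
        free(q.items);
        return -1;
    }
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    sem_destroy(&q.ready);
    free(q.items);
    return q.consumed_sum;
}
```

With `run_batched(10000, 100)` the producer posts 100 times instead of 10000. Whether that makes the threaded version faster than the single-threaded one still depends on how much work each item involves.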
Even if you rewrote your multi-threaded code to first produce all the numbers and then consume them, it would still be slower, because that would be exactly the same as the single-threaded code plus the signalling and context switches.
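Threads tend to pay off when each one has independent work and synchronizes only once, at the end. As a sketch of that shape (not a rewrite of the question's program), the code below splits a summation into one contiguous slice per thread and joins once. `Slice`, `sum_slice`, `parallel_sum` and `MAX_THREADS` are hypothetical names chosen for illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16  /* arbitrary cap for this sketch */

/* Hypothetical helper type: one contiguous slice of the input per thread. */
typedef struct {
    const int *data;
    size_t begin, end;  /* half-open range [begin, end) */
    long long sum;      /* written only by the thread that owns the slice */
} Slice;

static void *sum_slice(void *arg) {
    Slice *s = arg;
    long long acc = 0;
    for (size_t i = s->begin; i < s->end; ++i)
        acc += s->data[i];
    s->sum = acc;
    return NULL;
}

/* Sums data[0..n) using up to nthreads threads and returns the total. */
long long parallel_sum(const int *data, size_t n, int nthreads) {
    pthread_t tid[MAX_THREADS];
    Slice slice[MAX_THREADS];
    int started[MAX_THREADS];
    long long total = 0;

    if (nthreads < 1)
        nthreads = 1;
    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;

    for (int t = 0; t < nthreads; ++t) {
        slice[t].data = data;
        slice[t].begin = n * (size_t)t / (size_t)nthreads;
        slice[t].end = n * (size_t)(t + 1) / (size_t)nthreads;
        slice[t].sum = 0;
        started[t] = pthread_create(&tid[t], NULL, sum_slice, &slice[t]) == 0;
        if (!started[t])
            sum_slice(&slice[t]);  /* thread creation failed: run inline */
    }
    /* The only synchronization point: wait for every slice to finish. */
    for (int t = 0; t < nthreads; ++t) {
        if (started[t])
            pthread_join(tid[t], NULL);
        total += slice[t].sum;
    }
    return total;
}
```

No semaphore is needed because no thread reads data that another thread writes. For an input as small as the 10000 integers in the question, the cost of creating the threads may still outweigh the gain.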
The algorithm you are testing doesn't need to be split across multiple threads to be more efficient: there is always an overhead in creating a new thread (i.e. allocating resources, waiting for synchronization, etc.). You must evaluate the trade-off between the overhead of creating new threads and the complexity of the single-threaded work.
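When evaluating that trade-off, it helps to measure elapsed wall-clock time. `clock()`, as used in the question, reports CPU time for the whole process, so time spent by two threads running at once is added together. A minimal sketch using `clock_gettime` with `CLOCK_MONOTONIC` follows; `elapsed_seconds` and `time_wall` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Seconds between two readings of the same clock. */
double elapsed_seconds(struct timespec start, struct timespec end) {
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Hypothetical helper: runs work(arg) once and returns the wall-clock
   seconds it took, or -1.0 if the clock could not be read. */
double time_wall(void (*work)(void *), void *arg) {
    struct timespec t0, t1;
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    work(arg);
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;
    return elapsed_seconds(t0, t1);
}
```

Wrapping both the single-threaded and the multi-threaded variant in a function and passing each to `time_wall` gives figures that can be compared directly.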