
Performance degrades with more than 6 threads

I wrote multi-threaded C++ code on an Ubuntu 19 based server. The server has 8 cores with 2 threads per core and 126 GB of available memory. Each thread does some processing independently and then writes to a single file. Because the file is a shared resource, access to it is guarded by a mutex. The processing done by each thread is computationally expensive and takes hours (8 hours with 6 threads) before it writes the result to the shared file.
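For readers following along, the setup described above can be sketched roughly as follows. This is only an illustration under assumptions: `expensive_work` and `run_workers` are hypothetical names, the loop is a stand-in for the hours-long computation, and `std::thread` is used here for brevity instead of the pthreads calls in the question. The point it shows is that each worker holds the mutex only for the brief write at the end, not during the computation.

```cpp
#include <cstddef>
#include <fstream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the hours-long computation in the question.
long expensive_work(int worker_id) {
    long sum = 0;
    for (long i = 0; i < 1000000; ++i) sum += (i % 7) * worker_id;
    return sum;
}

// Runs `num_threads` workers. Each computes independently and takes the
// mutex only while it appends its line to the shared file at `path`.
// Returns the number of result lines the workers wrote.
std::size_t run_workers(int num_threads, const std::string& path) {
    std::ofstream out(path, std::ios::trunc);
    std::mutex out_mutex;      // guards both `out` and `lines`
    std::size_t lines = 0;

    std::vector<std::thread> workers;
    for (int id = 0; id < num_threads; ++id) {
        workers.emplace_back([&, id] {
            long result = expensive_work(id);           // no lock held here
            std::lock_guard<std::mutex> lock(out_mutex);
            out << id << ' ' << result << '\n';         // short critical section
            ++lines;
        });
    }
    for (auto& t : workers) t.join();
    return lines;
}
```

Calling `run_workers(6, "results.txt")` would mirror the six-thread run; the file name is an arbitrary example.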

With six threads, execution is fastest. With more than six threads, it takes longer. Here is the code I used to create the threads, with default parameters. Could that be causing the problem?

pthread_create(&id[i], NULL, &myFunc, (void*)&param[i]) != 0   /* returns 0 on success and an error number on failure, never -1 */
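As a side note on the call above: `pthread_create` returns 0 on success and an error number on failure, so a comparison against -1 never detects a failed creation. Passing `NULL` as the second argument simply requests default thread attributes. A minimal sketch of the creation loop with that check might look like this; `Param`, `myFunc`'s body, and `start_threads` are placeholders invented for illustration, not the asker's actual code.

```cpp
#include <pthread.h>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical parameter type; the real one is not shown in the question.
struct Param { int id; long result; };

// Placeholder thread function standing in for the real processing.
void* myFunc(void* arg) {
    Param* p = static_cast<Param*>(arg);
    p->result = static_cast<long>(p->id) * 2;
    return nullptr;
}

// Starts n threads with default attributes, then joins them.
// Returns the number of threads that were actually created.
int start_threads(int n, std::vector<Param>& param) {
    param.resize(n);                    // size fixed before any pointer is taken
    std::vector<pthread_t> id(n);
    std::vector<bool> started(n, false);
    int created = 0;
    for (int i = 0; i < n; ++i) {
        param[i] = Param{i, 0};
        int rc = pthread_create(&id[i], nullptr, &myFunc, &param[i]);
        if (rc != 0) {                  // nonzero return value is the error number
            std::fprintf(stderr, "pthread_create: %s\n", std::strerror(rc));
            continue;
        }
        started[i] = true;
        ++created;
    }
    for (int i = 0; i < n; ++i)
        if (started[i]) pthread_join(id[i], nullptr);
    return created;
}
```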

Six threads running the C++ code: the CPUs on which the threads run are fully utilized.

Sixteen threads running the C++ code: the CPUs are not fully utilized by the program, because kernel time (red bars) takes a larger share than the user threads (green bars). This is the worst case, but even with 7 threads performance degrades considerably and some red bars appear.

In another experiment, I created two processes with 6 threads each and ran them simultaneously on the server, so twelve threads were running in total across the two processes. That ran without any problem (no red bars).

To sum up, I cannot understand why more than 6 threads cause trouble when resources are available, while two processes with six threads each run without any problem.

  • Having more threads than cores is counter-productive because you lose CPU locality, i.e. the threads start constantly moving from one CPU to another. Linux tries its best to preserve locality: if cores are available, a thread will normally keep running on the same core as before. But if threads are constantly waiting on one another, you lose this optimization.

  • Advertising CPU cores with 2 threads each is somewhat fraudulent marketing. You still have only a fixed number of execution units, and those 2 threads compete for them. In the very rare case where one thread does only add/multiply and the other does only load/store, you could probably get some benefit. But most of the time those 2 threads won't run faster than if they were waiting on each other.
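One way to experiment with the locality point raised in these comments is to pin each worker to a fixed CPU so the scheduler cannot migrate it. The sketch below is Linux-specific and assumes glibc's `sched_getaffinity` and the non-portable `pthread_setaffinity_np`; the helper names `allowed_cpus`, `pin_current_thread`, and `run_pinned` are made up for illustration. Whether pinning actually helps the workload in the question is something only measurement can tell.

```cpp
#include <pthread.h>
#include <sched.h>
#include <thread>
#include <vector>

// Returns the logical CPUs the calling thread is currently allowed to use.
std::vector<int> allowed_cpus() {
    cpu_set_t set;
    CPU_ZERO(&set);
    std::vector<int> cpus;
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return cpus;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set)) cpus.push_back(cpu);
    return cpus;
}

// Pins the calling thread to a single logical CPU; true on success.
bool pin_current_thread(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}

// Starts `num_threads` workers, assigning allowed CPUs round-robin.
// Returns how many workers were pinned successfully.
int run_pinned(int num_threads) {
    std::vector<int> cpus = allowed_cpus();
    if (cpus.empty()) return 0;
    std::vector<int> pinned(num_threads, 0);   // one slot per worker, no sharing
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        int cpu = cpus[i % cpus.size()];
        workers.emplace_back([cpu, i, &pinned] {
            pinned[i] = pin_current_thread(cpu) ? 1 : 0;
            // The expensive computation would run here, staying on `cpu`.
        });
    }
    for (auto& t : workers) t.join();
    int ok = 0;
    for (int p : pinned) ok += p;
    return ok;
}
```

Note that `allowed_cpus()` lists logical CPUs, so on the server in the question it would typically report 16 entries for 8 physical cores; two consecutive entries may or may not share a core, depending on how the kernel numbers them.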
