Thread not improving the code performance
I am trying to convert a basic long-running loop to use threads in order to improve its performance.
Here is the threaded version:
#include <iostream>
#include <thread>
#include <chrono>
using namespace std;
using namespace std::chrono;
void funcSum(long long int start, long long int end, long long int *sum)
{
    for(auto i = start; i <= end; ++i)
    {
        *sum += i;
    }
}
int main()
{
    long long int start = 10, end = 1900000000;
    long long int sum = 0;
    auto startTime = high_resolution_clock::now();
    thread t1(funcSum, start, end / 2, &sum);
    thread t2(funcSum, end / 2 + 1 , end, &sum);
    t1.join();
    t2.join();
    auto stopTime = high_resolution_clock::now();
    auto duration = duration_cast<seconds>(stopTime - startTime);
    cout << "Sum: " << sum << endl;
    cout << duration.count() << " Seconds";
    return 0;
}
And here is the normal code (without threads):
#include <iostream>
#include <thread>
#include <chrono>
using namespace std;
using namespace std::chrono;
void funcSum(long long int start, long long int end, long long int *sum)
{
    for(auto i = start; i <= end; ++i)
    {
        *sum += i;
    }
}
int main()
{
    long long int start = 10, end = 1900000000;
    long long int sum = 0;
    auto startTime = high_resolution_clock::now();
    funcSum(start, end, &sum);
    auto stopTime = high_resolution_clock::now();
    auto duration = duration_cast<seconds>(stopTime - startTime);
    cout << "Sum: " << sum << endl;
    cout << duration.count() << " Seconds";
    return 0;
}
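Sum: 1805000000949999955
5 Seconds
Process finished with exit code 0
In both cases the time taken is 5 seconds.
Why does the first, threaded version not improve performance? How can I use threads to reduce the time needed to sum this range?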
A fixed version of the threaded code:
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <thread>
using namespace std;
using namespace std::chrono;

// Compute the sum of start ... end
class Summer {
public:
    long long int start;
    long long int end;
    long long int sum = 0;
    Summer(long long int aStart, long long int aEnd)
        : start(aStart),
          end(aEnd)
    {
    }
    void funcSum()
    {
        sum = 0;
        for (auto i = start; i <= end; ++i)
        {
            sum += i;
        }
    }
};
// Callable wrapper: std::thread copies the functor, but the copy still
// refers to the same Summer, so the result can be read after join()
class SummerFunctor {
    Summer& mSummer;
public:
    SummerFunctor(Summer& aSummer)
        : mSummer(aSummer)
    {
    }
    void operator()()
    {
        mSummer.funcSum();
    }
};
// Version with n thread objects reports
// 1 threads, sum = 1805000000949999955, 1587 ms
// 2 threads, sum = 1805000000949999955, 2547 ms
// 4 threads, sum = 1805000000949999955, 1251 ms
// 6 threads, sum = 1805000000949999955, 916 ms
int main()
{
    long long int start = 10, end = 1900000000;
    long long int sum = 0;
    auto startTime = high_resolution_clock::now();
    const size_t threadCount = 6;
    if (threadCount < 2) {
        // Single-threaded path: sum the whole range on the calling thread
        Summer summer(start, end);
        summer.funcSum();
        sum = summer.sum;
    } else {
        Summer* summers[threadCount];
        std::thread* threads[threadCount];
        // Start threads
        // Keep the partition size signed so that both std::min arguments
        // have the same type (size_t is unsigned)
        const long long int partitionSize =
            (end - start) / static_cast<long long int>(threadCount);
        for (size_t i = 0; i < threadCount; ++i) {
            auto partitionEnd = std::min(start + partitionSize, end);
            summers[i] = new Summer(start, partitionEnd);
            start = partitionEnd + 1;
            SummerFunctor functor (*summers[i]);
            threads[i] = new std::thread(functor);
        }
        // Join threads and add up the partial sums
        for (size_t i = 0; i < threadCount; ++i) {
            threads[i]->join();
            sum += summers[i]->sum;
            delete threads[i];
            delete summers[i];
        }
    }
    auto stopTime = high_resolution_clock::now();
    auto duration = duration_cast<milliseconds>(stopTime - startTime);
    cout << threadCount << " threads, sum = " << sum << ", " << duration.count() << " ms" << std::endl;
    return 0;
}
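I had to wrap the Summer object in a functor because std::thread insists on copying the functor handed to it, and we cannot access that copy afterwards. Execution improves as more threads are used (see the comments for run times). Possible reasons: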
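A note on the functor wrapper rather than on the timings: the sketch below shows one way the wrapper might be avoided, by launching each thread with a lambda that captures its worker object by reference. It is only an illustration under stated assumptions, not the code that produced the measurements above. The names RangeSummer and parallelSum are hypothetical and do not appear in the original code. Each worker also accumulates into a local variable and writes its result back once, so no two threads ever write to the same variable.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the Summer class above (illustration only).
struct RangeSummer {
    long long start;
    long long end;
    long long sum;

    void run()
    {
        long long local = 0;              // thread-private accumulator
        for (long long i = start; i <= end; ++i) {
            local += i;
        }
        sum = local;                      // single write-back at the end
    }
};

// Sums start..end (inclusive) using threadCount threads; assumes threadCount >= 1.
long long parallelSum(long long start, long long end, std::size_t threadCount)
{
    const long long n = static_cast<long long>(threadCount);
    const long long partitionSize = (end - start) / n;

    // Build every worker first so the vector never reallocates while
    // threads hold references to its elements.
    std::vector<RangeSummer> summers;
    long long next = start;
    for (long long i = 0; i < n; ++i) {
        const long long partitionEnd = std::min(next + partitionSize, end);
        summers.push_back(RangeSummer{next, partitionEnd, 0});
        next = partitionEnd + 1;
    }

    // The lambda captures the worker by reference, so std::thread copies
    // only the lambda and not the RangeSummer it refers to.
    std::vector<std::thread> threads;
    threads.reserve(summers.size());
    for (RangeSummer& s : summers) {
        threads.emplace_back([&s] { s.run(); });
    }

    // Join each thread, then read its partial result.
    long long total = 0;
    for (std::size_t i = 0; i < threads.size(); ++i) {
        threads[i].join();
        total += summers[i].sum;
    }
    return total;
}
```

With the range from the question, parallelSum(10, 1900000000, 6) should return the same 1805000000949999955 reported above. Timings depend on the machine and compiler flags, so none are claimed here.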