How many threads should I create?
Based on this question, I have a class whose constructor only does some assignments; a build() member function then does the actual work.
I know that the number of objects I will have to build is in the range [2, 16]. The actual number is a user parameter.
I create my objects in a for loop like this:
for (int i = 0; i < n; ++i) {
roots.push_back(RKD<DivisionSpace>(...));
}
and then in another for loop I create the threads. Every thread calls build() on a chunk of objects, based on this logic:
If your vector has n elements and you have p threads, thread i writes only to elements [i * n / p, (i + 1) * n / p).
So for example, the situation is like this:
std::vector<RKD<Foo>> foos;
// here is a for loop that pushes back 'n' objects to foos
// thread A // thread B // thread C
foos[0].build(); foos[n / 3 + 0].build(); foos[2 * n / 3 + 0].build();
foos[1].build(); foos[n / 3 + 1].build(); foos[2 * n / 3 + 1].build();
foos[2].build(); foos[n / 3 + 2].build(); foos[2 * n / 3 + 2].build();
... ... ...
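To make the chunking rule concrete, here is a minimal sketch of it. Item is a hypothetical stand-in for RKD<DivisionSpace>, and build_in_chunks is a name I made up for illustration; it is not the code from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for RKD<DivisionSpace>: the constructor only stores
// data and build() does the actual work.
struct Item {
    bool built = false;
    void build() { built = true; }
};

// Calls build() on every element using p threads.
// Thread i handles the half-open index range [i * n / p, (i + 1) * n / p),
// so no two threads ever touch the same element.
void build_in_chunks(std::vector<Item>& items, std::size_t p) {
    assert(p > 0);
    const std::size_t n = items.size();
    p = std::min(p, n);  // never start more threads than there are objects

    std::vector<std::thread> workers;
    workers.reserve(p);
    for (std::size_t i = 0; i < p; ++i) {
        const std::size_t begin = i * n / p;
        const std::size_t end = (i + 1) * n / p;
        workers.emplace_back([&items, begin, end] {
            for (std::size_t k = begin; k < end; ++k) {
                items[k].build();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
}
```

With n = 7 and p = 3 the ranges come out as [0, 2), [2, 4) and [4, 7), so the last thread absorbs the remainder.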
The approach I followed was to determine the number of threads p like this:
p = min(n, P)
where n is the number of objects I want to create and P is the return value of std::thread::hardware_concurrency. After dealing with some issues that this C++11 feature has, I read this:
Even when hardware_concurrency is implemented, it cannot be relied on as a direct mapping to the number of cores. This is what the standard says it returns - The number of hardware thread contexts. And goes on to state - This value should only be considered to be a hint. If your machine has hyperthreading enabled, it's entirely possible the value returned will be 2x the number of cores. If you want a reliable answer, you'll need to use whatever facilities your OS provides.
– Praetorian
That means that I should probably change my approach, since this code is meant to be run by many users (not only on my system; many people are going to run it). So, I would like to choose the number of threads in a way that is both standard and efficient. Since the number of objects is relatively small, is there some rule to follow?
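For reference, p = min(n, P) could be written roughly as below. One detail worth handling is that std::thread::hardware_concurrency is allowed to return 0 when the value cannot be determined. The function names and the fallback of 2 are my own assumptions for this sketch, not something the standard prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Pure helper: p = min(n, P), where hw == 0 ("unknown") is replaced by
// `fallback`. The default fallback of 2 is an arbitrary assumption.
unsigned clamp_thread_count(unsigned n, unsigned hw, unsigned fallback = 2) {
    assert(fallback > 0);
    const unsigned P = (hw != 0) ? hw : fallback;
    // Always return at least one thread, even if n == 0.
    return std::max(1u, std::min(n, P));
}

// Uses the hint from the standard library as P.
unsigned choose_thread_count(unsigned n) {
    return clamp_thread_count(n, std::thread::hardware_concurrency());
}
```

Splitting the clamp from the query keeps the arithmetic testable without depending on the machine it runs on.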
Just pick a thread pool of hardware_concurrency threads and queue the items on a first come, first served basis.
If other processes in the system somehow get priority from the OS, so be it. This simply means that fewer threads than the allocated pool size (e.g. P - 1) can run simultaneously. It doesn't matter, since the first available pool thread that is done build()-ing one item will pick the next item from the queue.
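A minimal sketch of that idea, under a simplifying assumption: since all items are known up front, a shared atomic index plays the role of the queue instead of a full thread pool class. Task and build_first_come_first_served are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the objects to build; it counts build() calls
// so it is easy to check that each object is built exactly once.
struct Task {
    int builds = 0;
    void build() { ++builds; }
};

// Starts at most pool_size workers. Each worker repeatedly claims the next
// unclaimed index and builds that object, so whichever thread becomes free
// first picks up the next item.
void build_first_come_first_served(std::vector<Task>& tasks, unsigned pool_size) {
    assert(pool_size > 0);
    const std::size_t n = tasks.size();
    const std::size_t workers = std::min<std::size_t>(pool_size, n);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed task

    std::vector<std::thread> pool;
    pool.reserve(workers);
    for (std::size_t t = 0; t < workers; ++t) {
        pool.emplace_back([&tasks, &next, n] {
            for (;;) {
                const std::size_t k = next.fetch_add(1);  // claim one index
                if (k >= n) break;                        // nothing left
                tasks[k].build();
            }
        });
    }
    for (auto& w : pool) {
        w.join();
    }
}
```

Compared with fixed chunks, a thread that gets descheduled or lands on a slow item does not hold up a whole range of objects; the other workers simply claim more indices.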
To really avoid threads competing over the same core, you could:
use a semaphore (an interprocess semaphore if you want to actually coordinate the builder threads from separate processes)
use thread affinity (to prevent the OS from scheduling a particular thread onto a different core in the next time slice); sadly I don't think there is a standard, platform-independent way to set thread affinity (yet).
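To illustrate why affinity is not portable, here is a rough, Linux-only sketch that pins the calling thread using the non-standard pthread_setaffinity_np. The function name pin_current_thread_to_cpu is hypothetical, and on any other platform this sketch just reports failure.

```cpp
#include <thread>
#if defined(__linux__)
#include <pthread.h>
#include <sched.h>
#endif

// Tries to pin the calling thread to logical CPU `cpu`.
// Returns true on success; false if the request failed or if the platform
// is not handled by this sketch.
bool pin_current_thread_to_cpu(unsigned cpu) {
#if defined(__linux__)
    if (cpu >= static_cast<unsigned>(CPU_SETSIZE)) {
        return false;  // index does not fit in a cpu_set_t
    }
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // Non-standard GNU extension; returns 0 on success.
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
#else
    (void)cpu;
    return false;
#endif
}
```

A worker would call this once at the start of its thread function. The call can legitimately fail, for example when the process is not allowed to run on that CPU, so the return value should be treated as best effort.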
I see no compelling reason to make it more complicated.