
C++ design of a multi-threaded sort? Futures vs Threadpool vs others?

Original 2017-10-17 10:11:57 0 1 c++/ multithreading/ sorting

I'm looking to parallelize bucket sort as an exercise. I am sorting integers lexicographically.

The motivation is that this is my first time doing any form of parallel programming. I'm trying to learn about the different ways of doing it and their advantages and disadvantages.

E.g. {1,6,2,6778,8,2,43,52,23} -> [1 2 2 23 43 52 6 6778 8]

There are 3 steps to the task:

Initialize 9 vectors, 1 for each final bucket.

1) Sort a chunk into buckets. This step is parallelized by giving each thread a portion of the data.

2) Sort each bucket into lexicographic order.

3) Concatenate all buckets.
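For reference, the three steps can be sketched sequentially before any threads are involved. This is only a minimal sketch under my own assumptions: the inputs are positive integers, the 9 buckets are keyed by leading decimal digit (1 to 9), and the names `leading_digit`, `lex_less` and `lex_bucket_sort` are hypothetical, not from the question.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

// Leading decimal digit of a positive integer (1..9). Assumes x > 0.
int leading_digit(int x) {
    while (x >= 10) x /= 10;
    return x;
}

// "Less than" on the decimal string representations.
bool lex_less(int a, int b) {
    return std::to_string(a) < std::to_string(b);
}

std::vector<int> lex_bucket_sort(const std::vector<int>& input) {
    // One bucket per leading digit 1..9.
    std::array<std::vector<int>, 9> buckets;

    // Step 1: distribute the values into buckets by leading digit.
    for (int x : input) buckets[leading_digit(x) - 1].push_back(x);

    // Step 2: sort each bucket lexicographically.
    for (auto& b : buckets) std::sort(b.begin(), b.end(), lex_less);

    // Step 3: concatenate the buckets in digit order.
    std::vector<int> out;
    out.reserve(input.size());
    for (const auto& b : buckets) out.insert(out.end(), b.begin(), b.end());
    return out;
}
```

Steps 1 and 2 are the loops that the parallel variants would split across threads; step 3 stays sequential here.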

[Figure: visualization of the sort]

Option 1: Threadpool. I'm considering dividing all those tasks into jobs for 2 different functions, a bucketize function and a sort_bucket function, then feeding them into a thread pool.
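To make Option 1 concrete, here is a rough sketch of a very small thread pool and of step 2 submitted to it as jobs. The standard library does not ship a thread pool, so the `ThreadPool` class, its `submit` method and `sort_buckets_with_pool` are all hypothetical names written for illustration. The sketch assumes at least one worker thread and jobs that do not throw.

```cpp
#include <algorithm>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: a fixed set of workers pulling jobs from a queue.
class ThreadPool {
public:
    explicit ThreadPool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    // The destructor lets the workers drain the queue, then joins them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stop requested and queue drained
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can dequeue
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
    bool stop_ = false;
};

// Step 2 as pool jobs: one sort job per bucket.
void sort_buckets_with_pool(std::vector<std::vector<int>>& buckets, unsigned n_threads) {
    ThreadPool pool(n_threads);
    for (auto& b : buckets) {
        pool.submit([&b] {
            std::sort(b.begin(), b.end(), [](int x, int y) {
                return std::to_string(x) < std::to_string(y);
            });
        });
    }
}  // leaving the scope destroys the pool, which waits for every job
```

In this sketch the end of a step is marked by destroying the pool. A pool that is reused across steps would need some other way to wait for a batch of jobs, which is part of the extra complexity of this option.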

Option 2: Futures. Alternatively, create futures of the functions and wait at the end of each step: wait for all futures to return at the end of step 1, then create futures of sort_bucket in step 2 and join them. Can anyone provide any opinions on these methods?
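Option 2 could look roughly like the following, using `std::async` and `std::future` from `<future>`. The function names `bucketize` and `sort_bucket` come from the question, but their signatures here are my assumptions, as are `lex_sort_with_futures`, the `Buckets` alias and the `n_tasks` parameter (assumed to be at least 1). Each step 1 task fills its own local buckets, so no locking is needed, and the results are merged after `get()`.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

using Buckets = std::array<std::vector<int>, 9>;

// Leading decimal digit of a positive integer (1..9). Assumes x > 0.
int leading_digit(int x) {
    while (x >= 10) x /= 10;
    return x;
}

// Step 1 worker: bucketize data[begin, end) into task-local buckets.
Buckets bucketize(const std::vector<int>& data, std::size_t begin, std::size_t end) {
    Buckets local;
    for (std::size_t i = begin; i < end; ++i)
        local[leading_digit(data[i]) - 1].push_back(data[i]);
    return local;
}

// Step 2 worker: sort one bucket lexicographically.
void sort_bucket(std::vector<int>& b) {
    std::sort(b.begin(), b.end(), [](int x, int y) {
        return std::to_string(x) < std::to_string(y);
    });
}

std::vector<int> lex_sort_with_futures(const std::vector<int>& data, std::size_t n_tasks) {
    // Step 1: one future per chunk of the input.
    const std::size_t chunk = (data.size() + n_tasks - 1) / n_tasks;
    std::vector<std::future<Buckets>> parts;
    for (std::size_t t = 0; t < n_tasks; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        parts.push_back(std::async(std::launch::async,
                                   [&data, begin, end] { return bucketize(data, begin, end); }));
    }
    // Wait for step 1 and merge the task-local buckets.
    Buckets buckets;
    for (auto& f : parts) {
        Buckets local = f.get();
        for (std::size_t d = 0; d < buckets.size(); ++d)
            buckets[d].insert(buckets[d].end(), local[d].begin(), local[d].end());
    }
    // Step 2: one future per bucket, then wait for all of them.
    std::vector<std::future<void>> sorts;
    for (auto& b : buckets)
        sorts.push_back(std::async(std::launch::async, [&b] { sort_bucket(b); }));
    for (auto& f : sorts) f.get();

    // Step 3: concatenate.
    std::vector<int> out;
    out.reserve(data.size());
    for (const auto& b : buckets) out.insert(out.end(), b.begin(), b.end());
    return out;
}
```

The two `get()` loops are the "wait at the end of each step" barriers described above.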

CPU utilization: I can be sure that in the threadpool version I am using the appropriate number of threads with regard to available processors. With futures, would they be scheduled appropriately by the OS?
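Two standard facilities bear on this question, sketched below. `std::thread::hardware_concurrency()` gives a hint for sizing a pool and may return 0 when the value is unknown. A future created with `std::launch::deferred` does not run on another thread at all; it runs on the thread that calls `get()`. With `std::launch::async` the task runs as if on a new thread, and the standard does not promise any cap on how many run at once. The helper names and the fallback value of 2 are my assumptions.

```cpp
#include <future>
#include <thread>

// Number of workers to use; hardware_concurrency() is only a hint.
unsigned worker_count() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 2 : n;  // fallback of 2 is an arbitrary assumption
}

// A deferred task is executed lazily by whoever calls get().
bool deferred_runs_on_caller() {
    const auto caller = std::this_thread::get_id();
    auto f = std::async(std::launch::deferred,
                        [] { return std::this_thread::get_id(); });
    return f.get() == caller;
}

// An async-policy task is executed as if on a new thread.
bool async_runs_on_other_thread() {
    const auto caller = std::this_thread::get_id();
    auto f = std::async(std::launch::async,
                        [] { return std::this_thread::get_id(); });
    return f.get() != caller;
}
```

So with futures the number of concurrent tasks is whatever the code creates, unless the number of futures is itself limited to something like `worker_count()`.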

Are there other ways I've missed out on? I'm trying to learn, so I would like to compare all the possible methods of doing this.

Thanks!

1 answer

You could sort subsequences of the initial array (in parallel, so in different threads), then merge them.
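A minimal sketch of that idea with just two subsequences, using `std::thread` and `std::inplace_merge`. The function name `parallel_sort_two_halves` is hypothetical, and the optional comparator is my addition so that a lexicographic ordering could be plugged in.

```cpp
#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

template <class Compare = std::less<int>>
void parallel_sort_two_halves(std::vector<int>& v, Compare comp = Compare{}) {
    const auto mid = v.begin() + v.size() / 2;

    // Sort the left half on a second thread...
    std::thread left([&v, mid, comp] { std::sort(v.begin(), mid, comp); });
    // ...while the calling thread sorts the right half.
    std::sort(mid, v.end(), comp);
    left.join();

    // The two halves touch disjoint elements, so no lock is needed.
    // Merge the two sorted runs into one sorted sequence.
    std::inplace_merge(v.begin(), mid, v.end(), comp);
}
```

With more than two chunks the same pattern applies: sort each chunk on its own thread, join, then merge the sorted runs pairwise.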

BTW the overhead is not negligible. You probably need an initial array of many tens of thousands of elements to observe a gain from parallelisation, and you are likely to sometimes observe a loss (e.g. with an initial array that is too small).

And for a first multi-threaded project, I'd rather suggest having a (nearly) fixed small set of threads (at most a dozen of them, assuming your computer has 8 cores). So both thread pools and futures are IMHO too complex for that.
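A fixed small set of threads can be as simple as the sketch below, which applies it to step 2 of the question: thread `t` sorts buckets `t`, `t + n_threads`, `t + 2 * n_threads`, and so on, and the caller joins all threads at the end. The names `sort_buckets_fixed_threads` and `lex_less` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// "Less than" on the decimal string representations.
bool lex_less(int a, int b) {
    return std::to_string(a) < std::to_string(b);
}

void sort_buckets_fixed_threads(std::array<std::vector<int>, 9>& buckets,
                                unsigned n_threads) {
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < n_threads; ++t) {
        threads.emplace_back([&buckets, t, n_threads] {
            // Each thread owns a disjoint, strided subset of the buckets,
            // so no two threads ever touch the same vector.
            for (std::size_t i = t; i < buckets.size(); i += n_threads)
                std::sort(buckets[i].begin(), buckets[i].end(), lex_less);
        });
    }
    // Wait for every worker before the caller reads the buckets.
    for (auto& th : threads) th.join();
}
```

Because the work is divided up front, there is no job queue and no future to manage, only a create loop and a join loop.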

Threads are heavy and expensive. Each one needs at least a call stack (of a megabyte), and actually much more.

Don't forget synchronization (e.g. with mutexes).
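As an illustration of where a mutex would be needed: if several threads push into the same shared buckets during step 1, each `push_back` has to be protected. The sketch below uses one `std::mutex` per bucket with `std::lock_guard`. The `LockedBucket` struct and `bucketize_shared` function are hypothetical, and the input is assumed to hold positive integers.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// A bucket together with the mutex that guards it.
struct LockedBucket {
    std::mutex m;
    std::vector<int> values;
};

// Leading decimal digit of a positive integer (1..9). Assumes x > 0.
int leading_digit(int x) {
    while (x >= 10) x /= 10;
    return x;
}

void bucketize_shared(const std::vector<int>& data,
                      std::array<LockedBucket, 9>& buckets,
                      unsigned n_threads) {
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < n_threads; ++t) {
        threads.emplace_back([&data, &buckets, t, n_threads] {
            // Each thread reads a strided portion of the (read-only) input.
            for (std::size_t i = t; i < data.size(); i += n_threads) {
                LockedBucket& b = buckets[leading_digit(data[i]) - 1];
                // Without this lock, concurrent push_back calls on the
                // same vector would be a data race.
                std::lock_guard<std::mutex> lock(b.m);
                b.values.push_back(data[i]);
            }
        });
    }
    for (auto& th : threads) th.join();
}
```

The order of values inside a bucket depends on thread timing, which does not matter here since each bucket is sorted afterwards. Giving each thread its own local buckets and merging after the join avoids the locks entirely.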

This is a Pthread tutorial that you could adapt to C++ threads.