
C++ Thread execution order in a thread pool

Does anyone know of a C++ thread pool implementation that allows both parallel execution (like a typical thread pool) and back-to-back serial execution order? I have spent several days trying to make this work by modifying the following thread pool, but I cannot get it working. I have looked into the techniques used by Intel TBB, and also into the concepts from Microsoft's PPL (its Asynchronous Agents Library looks promising). Both offer task-oriented techniques that achieve the above. Unfortunately, these solutions will not work on my target, an embedded PowerPC Linux platform.

EDIT: I put together a live Coliru demo with source that produces the thread graph and also shows a good example of a scheduler_loop where, theoretically, one could wait for threads to complete. The code also shows a UtlThreadPool with 2 threads that I feed with the concurrent tasks. However, that 'feeding' is not fully correct and will need a little work to traverse the nodes.
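For readers without access to the demo, the "wait for threads to complete" step can be sketched as a fixed-size pool with a barrier. This is only a minimal sketch under assumptions: BarrierPool, submit and wait_idle are hypothetical names, not part of UtlThreadPool or any library, and tasks are assumed not to throw. It needs only C++11 standard-library threading, so it should not depend on the CPU architecture.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool (not UtlThreadPool): a fixed set of workers
// plus wait_idle(), which blocks until every submitted task has finished.
// Assumption: tasks do not throw (an escaping exception would terminate).
class BarrierPool {
public:
    explicit BarrierPool(std::size_t threads) {
        assert(threads > 0);
        for (std::size_t i = 0; i < threads; ++i) {
            mWorkers.emplace_back([this] { worker_loop(); });
        }
    }

    ~BarrierPool() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mStopping = true;
        }
        mWorkAvailable.notify_all();
        for (auto& worker : mWorkers) worker.join();
    }

    BarrierPool(const BarrierPool&) = delete;
    BarrierPool& operator=(const BarrierPool&) = delete;

    // Queue a task; any idle worker may pick it up.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mTasks.push(std::move(task));
            ++mUnfinished;
        }
        mWorkAvailable.notify_one();
    }

    // Barrier: returns once the queue is empty and no task is running.
    void wait_idle() {
        std::unique_lock<std::mutex> lock(mMutex);
        mIdle.wait(lock, [this] { return mUnfinished == 0; });
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mWorkAvailable.wait(
                    lock, [this] { return mStopping || !mTasks.empty(); });
                if (mTasks.empty()) return;  // stopping and queue drained
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            task();
            {
                std::lock_guard<std::mutex> lock(mMutex);
                if (--mUnfinished == 0) mIdle.notify_all();
            }
        }
    }

    std::mutex mMutex;
    std::condition_variable mWorkAvailable;  // signalled when work arrives
    std::condition_variable mIdle;           // signalled when all work is done
    std::queue<std::function<void()>> mTasks;
    std::vector<std::thread> mWorkers;
    std::size_t mUnfinished = 0;             // queued + currently running
    bool mStopping = false;
};
```

A scheduler loop would submit every task of one concurrent group, call wait_idle(), and only then submit the dependent group. Note that wait_idle() waits for everything in the pool, so this sketch suits one dependency chain per pool.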

The data structure I use to build the execution graph is shown below. It is essentially a linked list of PriorityNodes: each one contains a vector of PriorityLevel tasks that can run concurrently, and a pointer to the next PriorityNode, which holds the tasks to be run serially afterwards. Once ALL of a node's concurrent tasks have completed, if its mNextNode member is not nullptr, the next node should be scheduled to run in the thread pool (and so forth until mNextNode is nullptr). Sequencing through this linked list of PriorityNodes is how I would like the thread pool to sequence through its threads. The PriorityNode has a stream insertion operator that typically produces output like the following. This would mean that 1A1 can run concurrently with 1A2, and when both of these have completed, the next PriorityNode would allow 1B1, 1B2, 1B3 and 1B4 to run concurrently, on however many threads the pool has available.

1A1
1A2
+-1B1
+-1B2
+-1B3
+-1B4
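The sequencing described above (run every task of a node concurrently, wait for all of them, then move on to the next node) can be sketched with the standard library alone. This is a simplified illustration, not a thread pool: it uses std::async with std::launch::async, which starts a thread per task instead of reusing a bounded set of workers. Stage, Task and run_stages are hypothetical names standing in for PriorityNode and its scheduling; they do not appear in the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <vector>

// Hypothetical stand-in for PriorityNode: a batch of tasks that may run
// concurrently, plus the stage that may start only after all of them finish.
using Task = std::function<void()>;

struct Stage {
    std::vector<Task> concurrent;
    std::shared_ptr<Stage> next;
};

// Walk the linked list: launch every task of the current stage, block until
// all of them have finished, then advance to the next stage.
// Returns the number of stages executed.
inline std::size_t run_stages(std::shared_ptr<Stage> stage) {
    std::size_t executed = 0;
    while (stage) {
        std::vector<std::future<void>> pending;
        pending.reserve(stage->concurrent.size());
        for (const auto& task : stage->concurrent) {
            assert(task && "a stage must not contain empty tasks");
            // One thread per task; a real pool would bound this.
            pending.push_back(std::async(std::launch::async, task));
        }
        for (auto& done : pending) {
            done.get();  // barrier; also rethrows a task's exception
        }
        stage = stage->next;
        ++executed;
    }
    return executed;
}
```

With a first stage holding the 1A tasks and its next pointer set to a stage holding the 1B tasks, run_stages would not start any 1B task until both 1A tasks have returned.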

The nearest thing I have seen to a solution to this problem is Intel TBB (again, note that it is Intel-specific and I am on PowerPC); here is the example they use for serial execution order.

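#include <cstddef>
#include <memory>
#include <ostream>
#include <string>
#include <vector>

// Note: PriorityLevel is defined elsewhere in the asker's code (not shown).

/**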
 * Branch representing fundamental building block of
 * a priority tree containing szPriority entries.<p>
 *
 * Each priority tree struct contains a vector of concurrent
 * priorities that can be scheduled to run in the thread pool -
 * note that the thread pool must have no entries associated
 * with the current channel running before enqueueing these
 * tasks. The application must wait for the thread pool to
 * complete these tasks before queuing up the dependent tasks
 * described in the mNextNode smart pointer. If mNextNode is
 * unassigned (nullptr), then we have reached the end of the
 * tree.
 */
struct PriorityNode {
    explicit PriorityNode(
        const std::vector<PriorityLevel>& rConcurrent,
        const std::shared_ptr<PriorityNode>& rNext = std::shared_ptr<PriorityNode>(),
        const size_t& rDepth = 0)
        : mConcurrent(rConcurrent)
        , mNextNode(rNext)
        , mDepth(rDepth)
    {}

    /**
    * Stream insert operator<p>
    *
    * @param os     [in,out] output stream
    * @param rhs    [in] PriorityNode to send to the output
    *               stream.
    *
    * @return a reference to the updated stream
    */
    inline friend std::ostream& operator << (
        std::ostream& os, const PriorityNode& rhs) {
        // indent 2 spaces per depth level
        std::string indent = rhs.mDepth > 0 ?
            (std::string("+") +
            std::string((rhs.mDepth * 2) - 1, '-')) :
            std::string();
        // print out the concurrent threads that 
        // can be scheduled with the thread pool
        for (const auto& next : rhs.mConcurrent) {
            os << indent << next << std::endl;
        }
        // print the dependent priorities that can only
        // be scheduled when the concurrent ones are finished
        if (rhs.mNextNode) {
            os << *rhs.mNextNode << std::endl;
        }
        return os;
    }
    // these are all equivalent thread priorities
    // that can be run simultaneously
    std::vector<PriorityLevel> mConcurrent;

    // these are concurrent threads that must be run AFTER all
    // mConcurrent tasks have completed (exiting the thread pool)
    std::shared_ptr<PriorityNode> mNextNode;

    // recursion depth
    size_t mDepth;
};

Why not just use TBB on PowerPC? It is a highly portable library designed to be as cross-platform as practical, and I've heard it is being ported to Blue Gene by the TBB open-source community. You can ask on the Intel TBB forum, for example by reviving this forum thread.

Intel does not distribute PowerPC binaries for TBB, but you can try building it from source simply by running

make tbb

See also these community patches.

If anyone is still looking for this, check out this repo: https://github.com/hirak99/ordered_thread_pool

Basically, with this you can replace code like this:

while (...) {
  std::cout << CostlyFn(input) << std::endl;
}

with this:

OrderedThreadPool<std::string> pool{10, 1};
while (...) {
  pool.Do(
    [&input] { return CostlyFn(input); },
    [](const std::string& out) {
      std::cout << out << std::endl;
    });
}

And it will execute in order.
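The general idea of doing the costly work in parallel while consuming the results in submission order can also be illustrated with the standard library alone. The sketch below is not the ordered_thread_pool implementation and does not use its API. process_in_order is a hypothetical helper that launches jobs with std::async and reads the futures back in FIFO order, and its max_in_flight parameter is an assumption introduced here to bound the number of outstanding jobs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <queue>
#include <string>
#include <vector>

// Hypothetical helper: run `work` on each input concurrently, but hand the
// results to `consume` strictly in input order. At most `max_in_flight`
// jobs are outstanding at any time.
inline void process_in_order(
    const std::vector<std::string>& inputs,
    const std::function<std::string(const std::string&)>& work,
    const std::function<void(const std::string&)>& consume,
    std::size_t max_in_flight) {
    assert(max_in_flight > 0);
    std::queue<std::future<std::string>> in_flight;
    for (const auto& input : inputs) {
        if (in_flight.size() == max_in_flight) {
            // The oldest job is always consumed first, which keeps order.
            consume(in_flight.front().get());
            in_flight.pop();
        }
        // std::async copies `work` and `input` into the new thread.
        in_flight.push(std::async(std::launch::async, work, input));
    }
    while (!in_flight.empty()) {
        consume(in_flight.front().get());
        in_flight.pop();
    }
}
```

Because `consume` is only ever called from the submitting thread, the output step needs no locking of its own; the trade-off is that a slow early job delays the consumption of later results that are already finished.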

