Is there any way to speed up the future.get() function after launching the std::async function in a multithreaded environment?

Question

I have working code that processes 1000 image files using OpenCV. Since each file is processed independently of the others, I use multithreading via std::async.

I launch the threads with the following call inside a for loop:

std::vector<std::future<cv::Mat>> processingThread;
for (int i = 0; i < jsonObjects.size(); i++) {
    processingThread.emplace_back(std::async(
        std::launch::async,
        (cv::Mat (CameraToBEV::*)(json&, std::vector<cv::Point2f>, cv::Point2f, cv::Point2f)) &CameraToBEV::process,
        &cbevVec[i], std::ref(jsonObjects[i]), roiBox1, opDim, ipDim));
}

The code above works and takes about 100 milliseconds. To collect the results, I use another for loop:

std::vector<cv::Mat> opMatArr;
for (auto& future : processingThread) {
    opMatArr.emplace_back(future.get());
}

This also works, but it takes 9 seconds to execute, which seems to defeat the purpose of multithreading, since I am populating the vector of cv::Mat objects sequentially. Is there a way to collect the results in parallel, so that all the cv::Mat objects end up in the std::vector opMatArr within a few milliseconds?

Answer 1

Several things come to mind:

You say this is "defeating the purpose of using multithreading". What is the runtime of this code if you run it sequentially (i.e. remove the multithreading code and process each image in a plain loop)? I would bet it is a lot more than 9 seconds.
Because you pass std::launch::async, each std::async call starts its task on a separate thread and returns immediately; it does not wait for the task to finish, and you have no guarantee about when it will. When you call get(), your program blocks until that task's result is ready. So the 100 milliseconds covers only the cost of launching the threads, while the 9 seconds spent in the get() loop is the time the actual processing takes. Measuring the launch is useful, but it tells you little about how long the operation itself takes (unless the overhead of the threads is larger than the runtime of the combined operations).
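Check that you are not copying the results unnecessarily, since copies can hurt performance for no good reason. In this particular loop it is unlikely to be the problem: future.get() returns the cv::Mat by value, so emplace_back(future.get()) already moves it without an explicit std::move, and copying a cv::Mat only copies its header while the pixel data is shared.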
Creating this many threads will inevitably lead to contention for resources, be it disk or memory bandwidth, CPU, or even memory size. In general it is better to set up a thread pool of N threads (where N is the number of cores or hardware threads available on the system), together with some sort of task queue that starts new jobs as threads free up. Note that std::async may already do something like this under the hood (as far as I remember, an implementation has some freedom in how it schedules tasks, also depending on the launch policy passed to std::async).
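
To illustrate the point about what the two timings actually measure, here is a minimal sketch. The names slowTask, Timings and launchAndCollect are hypothetical, and slowTask just sleeps as a stand-in for CameraToBEV::process. It times the launch loop and the get() loop separately:

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real image processing: sleeps to simulate work.
int slowTask(int id, std::chrono::milliseconds work) {
    std::this_thread::sleep_for(work);
    return id * 2;
}

struct Timings {
    std::chrono::milliseconds launch;   // time spent in the std::async loop
    std::chrono::milliseconds collect;  // time spent in the get() loop
    std::vector<int> results;
};

Timings launchAndCollect(int taskCount, std::chrono::milliseconds work) {
    using clock = std::chrono::steady_clock;
    using std::chrono::duration_cast;
    using std::chrono::milliseconds;

    std::vector<std::future<int>> futures;
    futures.reserve(taskCount);

    const auto t0 = clock::now();
    for (int i = 0; i < taskCount; ++i) {
        // Returns as soon as the thread has been started, not when the task is done.
        futures.emplace_back(std::async(std::launch::async, slowTask, i, work));
    }
    const auto t1 = clock::now();

    std::vector<int> results;
    results.reserve(taskCount);
    for (auto& f : futures) {
        results.emplace_back(f.get());  // blocks until this task has finished
    }
    const auto t2 = clock::now();

    return {duration_cast<milliseconds>(t1 - t0),
            duration_cast<milliseconds>(t2 - t1),
            std::move(results)};
}
```

Calling launchAndCollect(4, std::chrono::milliseconds(200)) should report a very small launch time and a collect time of roughly the task duration. The get() loop is not slow in itself; it is simply where the program waits for the work to complete.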
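
As a rough sketch of the last point (limiting the number of threads), the code below runs a fixed number of workers that pull indices from a shared atomic counter instead of starting one thread per image. This is not a full thread pool, and processBounded and defaultWorkerCount are hypothetical names rather than part of any library. It also assumes the result type is default-constructible, as cv::Mat is, and is not bool, since std::vector<bool> is not safe to write to from several threads.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Runs fn(i) for every i in [0, count) on at most `workers` threads and
// returns the results in index order.
template <typename Fn>
auto processBounded(std::size_t count, unsigned workers, Fn fn)
    -> std::vector<decltype(fn(std::size_t{}))> {
    using Result = decltype(fn(std::size_t{}));
    assert(workers > 0);

    std::vector<Result> results(count);  // one pre-allocated slot per task
    std::atomic<std::size_t> next{0};    // index of the next unclaimed task

    // Each worker keeps claiming the next index until none are left.
    auto worker = [&] {
        for (std::size_t i = next.fetch_add(1); i < count; i = next.fetch_add(1)) {
            results[i] = fn(i);  // each slot is written by exactly one worker
        }
    };

    std::vector<std::future<void>> pool;
    const std::size_t n = std::min<std::size_t>(workers, count);
    for (std::size_t w = 0; w < n; ++w) {
        pool.emplace_back(std::async(std::launch::async, worker));
    }
    for (auto& f : pool) {
        f.get();  // waits for the worker and rethrows any exception it raised
    }
    return results;
}

// hardware_concurrency() may return 0 when the value cannot be determined,
// so fall back to a small fixed count (an arbitrary choice for this sketch).
unsigned defaultWorkerCount() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 2u : n;
}
```

With the code from the question, fn would presumably be a lambda that calls cbevVec[i].process(jsonObjects[i], roiBox1, opDim, ipDim) and count would be jsonObjects.size(). Whether this beats one thread per image depends on whether the work is bound by CPU, disk or memory, so measure both.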
