I am having an issue with terminating worker threads from the main thread. So far, each method I have tried leads to either a race condition or a deadlock.
The worker threads are implemented as an inner class, WorkerThread, of a class called ThreadPool. ThreadPool maintains a vector of these WorkerThreads using unique_ptr.
Here is the header for my ThreadPool:
class ThreadPool
{
public:
typedef void (*pFunc)(const wpath&, const Args&, Global::mFile_t&, std::mutex&, std::mutex&); // function to point to
private:
class WorkerThread
{
private:
ThreadPool* const _thisPool; // reference enclosing class
// pointers to arguments
wpath _pPath; // member argument that the running thread can modify
Args * _pArgs;
Global::mFile_t * _pMap;
// flags for thread management
bool _terminate; // terminate thread
bool _busy; // is thread busy?
bool _isRunning;
// thread management members
std::mutex _threadMtx;
std::condition_variable _threadCond;
std::thread _thisThread;
// exception ptr
std::exception_ptr _ex;
// private copy constructor
WorkerThread(const WorkerThread&): _thisPool(nullptr) {}
public:
WorkerThread(ThreadPool&, Args&, Global::mFile_t&);
~WorkerThread();
void setPath(const wpath); // sets a new task
void terminate(); // calls terminate on thread
bool busy() const; // returns whether thread is busy doing task
bool isRunning() const; // returns whether thread is still running
void join(); // thread join wrapper
std::exception_ptr exception() const;
// actual worker thread running tasks
void thisWorkerThread();
};
// thread specific information
DWORD _numProcs; // number of processors on system
unsigned _numThreads; // number of viable threads
std::vector<std::unique_ptr<WorkerThread>> _vThreads; // stores thread pointers - workaround for no move constructor in WorkerThread
pFunc _task; // the task threads will call
// synchronization members
unsigned _barrierLimit; // limit before barrier goes down
std::mutex _barrierMtx; // mutex for barrier
std::condition_variable _barrierCond; // condition for barrier
std::mutex _coutMtx;
public:
// argument mutex
std::mutex matchesMap_mtx;
std::mutex coutMatch_mtx;
ThreadPool(pFunc f);
// wake a thread and pass it a new parameter to work on
void callThread(const wpath&);
// barrier synchronization
void synchronizeStartingThreads();
// starts and synchronizes all threads in a sleep state
void startThreads(Args&, Global::mFile_t&);
// terminate threads
void terminateThreads();
private:
};
The real issue is that calling terminateThreads() from the main thread causes either a deadlock or a race condition.
When I set my _terminate flag to true, there is a chance that main will exit scope and destroy all the mutexes before the thread has had a chance to wake up and terminate. In fact, I have hit this crash quite a few times (the console window displays: mutex destroyed while busy).
If I add a thread.join() after I notify_all() the thread, there is a chance the thread will terminate before the join occurs, causing an infinite deadlock, as joining a terminated thread appears to suspend the program indefinitely.
If I detach instead, I hit the same issue as above, but it crashes the program.
If I instead use while(WorkerThread.isRunning()) Sleep(0); the program may crash because the main thread may exit before the WorkerThread reaches its last closing brace.
I am not sure what else to do to halt main until all worker threads have terminated safely. Also, even with try-catch in both the thread and main, no exceptions are caught (everything I have tried leads to a program crash).
What can I do to halt the main thread until worker threads have finished?
Here are the implementations of the primary functions:
Terminate Individual worker thread
void ThreadPool::WorkerThread::terminate()
{
_terminate = true;
_threadCond.notify_all();
_thisThread.join();
}
The actual ThreadLoop
void ThreadPool::WorkerThread::thisWorkerThread()
{
_thisPool->synchronizeStartingThreads();
try
{
while (!_terminate)
{
{
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Sleeping..." << std::endl;
_thisPool->_coutMtx.unlock();
_busy = false;
std::unique_lock<std::mutex> lock(_threadMtx);
_threadCond.wait(lock);
}
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Awake..." << std::endl;
_thisPool->_coutMtx.unlock();
if(_terminate)
break;
_thisPool->_task(_pPath, *_pArgs, *_pMap, _thisPool->coutMatch_mtx, _thisPool->matchesMap_mtx);
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Finished Task..." << std::endl;
_thisPool->_coutMtx.unlock();
}
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Terminating" << std::endl;
_thisPool->_coutMtx.unlock();
}
catch (const std::exception&)
{
_ex = std::current_exception();
}
_isRunning = false;
}
Terminate All Worker Threads
void ThreadPool::terminateThreads()
{
for (std::vector<std::unique_ptr<WorkerThread>>::iterator it = _vThreads.begin(); it != _vThreads.end(); ++it)
{
it->get()->terminate();
//it->get()->_thisThread.detach();
// if thread threw an exception, rethrow it in main
if (it->get()->exception() != nullptr)
std::rethrow_exception(it->get()->exception());
}
}
And lastly, the function that calls the thread pool (the Scan function runs on main):
// scans a path recursively for all files of selected extension type, calls thread to parse file
unsigned int Functions::Scan(wpath path, const Args& args, ThreadPool& pool)
{
wrecursive_directory_iterator d(path), e;
unsigned int filesFound = 0;
while ( d != e )
{
if (args.verbose())
std::wcout << L"Grepping: " << d->path().string() << std::endl;
for (Args::ext_T::const_iterator it = args.extension().cbegin(); it != args.extension().cend(); ++it)
{
if (extension(d->path()) == *it)
{
++filesFound;
pool.callThread(d->path());
}
}
++d;
}
std::cout << "Scan Function: Calling TerminateThreads() " << std::endl;
pool.terminateThreads();
std::cout << "Scan Function: Called TerminateThreads() " << std::endl;
return filesFound;
}
I'll repeat the question: What can I do to halt the main thread until worker threads have finished?
I don't get the issue with thread termination and join.
Joining a thread is all about waiting until that thread has terminated, so it's exactly what you want to do. If the thread has already finished execution, join will just return immediately.
So you'll just want to join each thread during the terminate call, as you already do in your code.
Note: currently you immediately rethrow any exception if a thread you just terminated has an active exception_ptr. That might leave the remaining threads unjoined. You'll have to keep that in mind when handling those exceptions.
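One way to avoid leaving threads unjoined is to join everything first and only then rethrow. The following is a minimal sketch of that idea, not the pool from the question: Job, joinAllThenRethrow and runDemo are hypothetical names made up for the illustration.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-thread record: the thread plus the exception it captured.
struct Job {
    std::thread thread;
    std::exception_ptr error;
};

// Join every thread first, then rethrow the first stored exception, if any.
void joinAllThenRethrow(std::vector<Job>& jobs) {
    for (Job& job : jobs)
        if (job.thread.joinable()) job.thread.join();
    for (Job& job : jobs)
        if (job.error) std::rethrow_exception(job.error);
}

// Demo: three threads, the second one fails. Returns the caught message.
std::string runDemo() {
    std::vector<Job> jobs(3);  // sized up front so element addresses stay stable
    for (std::size_t i = 0; i < jobs.size(); ++i) {
        Job* job = &jobs[i];
        job->thread = std::thread([job, i] {
            try {
                if (i == 1) throw std::runtime_error("task 1 failed");
            } catch (...) {
                job->error = std::current_exception();  // store it for main
            }
        });
    }
    try {
        joinAllThenRethrow(jobs);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "no error";
}
```

Because every thread is joined before anything is rethrown, the vector of threads can be destroyed safely even when one task failed.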
Update: after looking at your code, I see a potential bug: std::condition_variable::wait() can return on a spurious wakeup. If that happens, you will work again on the path that was processed last time, leading to wrong results. You should have a flag for new work that is set when new work has been added, and the _threadCond.wait(lock) line should be in a loop that checks that flag and _terminate. Not sure if that one will fix your problem, though.
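A rough sketch of such a wait loop, assuming a simplified single-slot worker rather than your actual WorkerThread (Worker, submit, done and hasWork_ are names invented for this example; the predicate overload of wait() is the loop):

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for WorkerThread: one thread with a single task slot.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { terminate(); }
    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    // Hand over a new path; blocks until the previous one has been picked up.
    void submit(std::string path) {
        std::unique_lock<std::mutex> lock(mtx_);
        cond_.wait(lock, [this] { return !hasWork_; });
        path_ = std::move(path);
        hasWork_ = true;
        cond_.notify_all();
    }

    // Ask the thread to stop once pending work is done, then wait for it.
    void terminate() {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            terminate_ = true;
        }
        cond_.notify_all();
        if (thread_.joinable()) thread_.join();
    }

    // Paths processed so far, in order.
    std::vector<std::string> done() {
        std::lock_guard<std::mutex> lock(mtx_);
        return done_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mtx_);
        for (;;) {
            // The predicate is re-checked on every wakeup, so a spurious
            // wakeup with no new work simply goes back to sleep.
            cond_.wait(lock, [this] { return hasWork_ || terminate_; });
            if (!hasWork_) break;  // terminate requested, nothing pending
            std::string path = std::move(path_);
            hasWork_ = false;      // slot is free again
            cond_.notify_all();
            lock.unlock();
            // ... the real task would process `path` here, without the lock ...
            lock.lock();
            done_.push_back(std::move(path));
        }
    }

    std::mutex mtx_;
    std::condition_variable cond_;
    std::string path_;
    bool hasWork_ = false;
    bool terminate_ = false;
    std::vector<std::string> done_;
    std::thread thread_;  // declared last so it starts after the other members exist
};
```

Both flags are only touched while holding the mutex, so the worker can never see a half-updated state, and terminate() can be followed by a plain join().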
The problem was twofold:
First, synchronizeStartingThreads() would sometimes leave 1 or 2 threads blocked, waiting for the okay to go ahead (a problem in the while (some_condition) barrierCond.wait(lock) loop). The condition would sometimes never evaluate to true. Removing the while loop fixed this blocking issue.
The second issue was that a worker thread could lock _threadMtx and notify_all could be called just before the thread entered _threadCond.wait(). Since the notification had already been sent, the thread would wait forever.
i.e.:
{
// terminate() is called
std::unique_lock<std::mutex> lock(_threadMtx);
// _threadCond.notify_all() is called here
_busy = false;
_threadCond.wait(lock);
// thread is blocked forever
}
Surprisingly, locking this mutex in terminate() did not stop this from happening.
This was solved by adding a timeout of 30 ms to the _threadCond.wait() call.
Also, a check was added before starting a task to make sure the same task wasn't processed again.
The new code now looks like this:
thisWorkerThread
_threadCond.wait_for(lock, std::chrono::milliseconds(30)); // wait at most 30 ms (the lock is released while waiting), then re-check the flags
// after the lock, and the termination check
if(_busy)
{
Global::mFile_t rMap = _thisPool->_task(_pPath, *_pArgs, _thisPool->coutMatch_mtx);
_workerMap.element.insert(rMap.element.begin(), rMap.element.end());
}
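For comparison, a missed notification can usually be avoided without a timeout by setting the flag while holding the mutex and waiting with a predicate, since the predicate is checked before the thread ever blocks. This is only a minimal sketch under that assumption, not the code I ended up with; StopSignal, request and notifyBeforeWaitStillWakes are hypothetical names.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a terminate flag guarded by its own mutex.
struct StopSignal {
    std::mutex mtx;
    std::condition_variable cond;
    bool terminate = false;

    // Set the flag under the lock, then notify.
    void request() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            terminate = true;
        }
        cond.notify_all();
    }

    // Returns as soon as the flag is set, even if notify_all() ran earlier.
    void wait() {
        std::unique_lock<std::mutex> lock(mtx);
        cond.wait(lock, [this] { return terminate; });
    }
};

// Demo: the notification is sent before the waiting thread even exists.
bool notifyBeforeWaitStillWakes() {
    StopSignal signal;
    signal.request();
    bool woke = false;
    std::thread waiter([&] {
        signal.wait();  // predicate is already true, so this does not block
        woke = true;
    });
    waiter.join();
    return woke;
}
```

With a bare wait(lock) the waiter in this demo would block forever, which matches the behaviour described above.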