Unable to unblock/"wake up" a thread with pthread_kill and sigwait

Question

I'm working on a C/C++ networking project and am having difficulties synchronizing/signaling my threads. Here is what I am trying to accomplish:

Poll a bunch of sockets using the poll function
If any sockets are ready from the POLLIN event, then send a signal to a reader thread and a writer thread to "wake up"

I have a class called MessageHandler that sets the signal mask and spawns the reader and writer threads. Inside them I then wait on the signal(s) that ought to wake them up.

The problem is that I am testing all this functionality by sending a signal to a thread, yet it never wakes up.

Here is the problem code with further explanation. Note that I have only shown how it works with the reader thread, as the writer thread is essentially the same.

// Called once if allowedSignalsMask == 0 in constructor
// STATIC
void MessageHandler::setAllowedSignalsMask() {
    allowedSignalsMask = (sigset_t*)std::malloc(sizeof(sigset_t));
    sigemptyset(allowedSignalsMask);
    sigaddset(allowedSignalsMask, SIGCONT);
}

// STATIC
sigset_t *MessageHandler::allowedSignalsMask = 0;

// STATIC
void* MessageHandler::run(void *arg) {
    // Apply the signals mask to any new threads created after this point
    pthread_sigmask(SIG_BLOCK, allowedSignalsMask, 0);

    MessageHandler *mh = (MessageHandler*)arg;
    pthread_create(&(mh->readerThread), 0, &runReaderThread, arg);

    sleep(1); // Just sleep for testing purposes let reader thread execute first
    pthread_kill(mh->readerThread, SIGCONT);
    sleep(1); // Just sleep for testing to let reader thread print without the process terminating

    return 0;
}

// STATIC
void* MessageHandler::runReaderThread(void *arg) {
    int signo;
    for (;;) {
        sigwait(allowedSignalsMask, &signo);

        fprintf(stdout, "Reader thread signaled\n");
    }

    return 0;
}

I took out all the error handling I had in the code to condense it, but I do know for a fact that the thread starts properly and gets to the sigwait call.

The error may be obvious (it's not a syntax error; the above code is condensed from compilable code and I might have screwed it up while editing it), but I just can't seem to find it, since I have spent far too much time on this problem and confused myself.

Let me explain what I think I am doing, so you can tell me whether it makes sense.

Upon creating an object of type MessageHandler, it will set allowedSignalsMask to the set containing the one signal (for the time being) that I am interested in using to wake up my threads.
I add the signal to the blocked signals of the current thread with pthread_sigmask. All further threads created after this point ought to have the same signal mask now.
I then create the reader thread with pthread_create, where arg is a pointer to an object of type MessageHandler.
I call sleep as a cheap way to ensure that my readerThread executes all the way to sigwait().
I send the signal SIGCONT to the readerThread, as I want sigwait to wake up/unblock once it receives it.
Again I call sleep as a cheap way to ensure that my readerThread can finish executing after it wakes up/unblocks from sigwait().

Other notes that may be useful, but that I don't think affect the problem:

MessageHandler is constructed, and then a different thread is created with a function pointer that points to run. This thread will be responsible for creating the reader and writer threads, polling the sockets with the poll function, and then possibly sending signals to both the reader and writer threads.

I know it's a long post, but I do appreciate you reading it and any help you can offer. If I wasn't clear enough or you feel like I didn't provide enough information, please let me know and I will correct the post.

Thanks again.

Answer 1

POSIX threads have condition variables for a reason; use them. You're not supposed to need signal hackery to accomplish basic synchronization tasks when programming with threads.

Here is a good pthread tutorial with information on using condition variables:

https://computing.llnl.gov/tutorials/pthreads/

Or, if you're more comfortable with semaphores, you could use POSIX semaphores (sem_init, sem_post, and sem_wait) instead. But once you figure out why the condition variable and mutex pairing makes sense, I think you'll find condition variables are a much more convenient primitive.
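To make the semaphore option concrete, here is a minimal sketch, assuming a single reader thread and an unnamed POSIX semaphore (as available on Linux). The names ReaderState, semReaderThread, and runSemaphoreDemo are hypothetical and do not come from the question's code.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <semaphore.h>

// Hypothetical state shared between the polling thread and one reader.
struct ReaderState {
    sem_t workAvailable;   // counts wakeups that have not been consumed yet
    int expected = 0;      // how many wakeups the reader should handle
    int handled = 0;       // how many wakeups the reader actually handled
};

// Reader thread: sleeps in sem_wait until the poller posts.
static void* semReaderThread(void* arg) {
    ReaderState* s = static_cast<ReaderState*>(arg);
    for (int i = 0; i < s->expected; ++i) {
        // Retry only when the wait was interrupted by a signal handler.
        while (sem_wait(&s->workAvailable) == -1 && errno == EINTR) {
        }
        ++s->handled;
    }
    return nullptr;
}

// Starts one reader, posts `events` wakeups, returns how many were handled.
int runSemaphoreDemo(int events) {
    ReaderState s;
    s.expected = events;
    int rc = sem_init(&s.workAvailable, 0, 0);   // pshared = 0, initial count = 0
    assert(rc == 0);

    pthread_t tid;
    rc = pthread_create(&tid, nullptr, semReaderThread, &s);
    assert(rc == 0);

    // Posts are counted, so none is lost even if the reader is not waiting yet.
    for (int i = 0; i < events; ++i)
        sem_post(&s.workAvailable);

    pthread_join(tid, nullptr);
    sem_destroy(&s.workAvailable);
    return s.handled;
}
```

Because the semaphore keeps a count, the sleep(1) calls from the question are not needed here: a post that arrives before the reader reaches sem_wait is simply consumed later. Build with -pthread.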

Also, note that your current approach incurs several syscalls (user-space/kernel-space transitions) per synchronization. With a good pthreads implementation, using condition variables should drop that to at most one syscall, and possibly none at all if your threads keep up with each other well enough that the waited-for event occurs while they're still spinning in user-space.

Answer 2

This pattern seems a bit odd, and most likely error-prone. The pthread library is rich in synchronization methods, and the one most likely to serve your need is the pthread_cond_* family. These methods handle condition variables, which implement the wait-and-signal approach.
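As a rough illustration of the pthread_cond_* approach, here is a minimal sketch, assuming one reader and a boolean "data ready" flag as the wake-up condition. WakeUp, readerThread, signalReader, and runCondVarDemo are hypothetical names, not part of the question's MessageHandler class.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical shared state: the flag is the real condition, and the
// condition variable is only the mechanism for sleeping until it changes.
struct WakeUp {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool dataReady = false;   // set by the polling thread
    int wakeups = 0;          // number of times the reader was woken
    WakeUp() {
        pthread_mutex_init(&lock, nullptr);
        pthread_cond_init(&cond, nullptr);
    }
    ~WakeUp() {
        pthread_cond_destroy(&cond);
        pthread_mutex_destroy(&lock);
    }
};

// Reader thread: sleeps until dataReady is set, then consumes it.
static void* readerThread(void* arg) {
    WakeUp* w = static_cast<WakeUp*>(arg);
    pthread_mutex_lock(&w->lock);
    while (!w->dataReady)                        // re-check after every wakeup
        pthread_cond_wait(&w->cond, &w->lock);   // unlocks while asleep, relocks on return
    w->dataReady = false;
    ++w->wakeups;
    pthread_mutex_unlock(&w->lock);
    return nullptr;
}

// Polling-thread side: record the event, then wake a waiter.
static void signalReader(WakeUp* w) {
    pthread_mutex_lock(&w->lock);
    w->dataReady = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

// Starts one reader, wakes it once, returns the number of wakeups it saw.
int runCondVarDemo() {
    WakeUp w;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, readerThread, &w);
    assert(rc == 0);
    signalReader(&w);   // safe even if the reader has not started waiting yet
    pthread_join(tid, nullptr);
    return w.wakeups;
}
```

The flag is what makes this robust: if signalReader runs before the reader reaches pthread_cond_wait, the reader sees dataReady already set and never sleeps, so no timing sleep is required. Build with -pthread.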

Answer 3

Use SIGUSR1 instead of SIGCONT. SIGCONT doesn't work. Maybe a signal expert knows why.
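For reference, a minimal sketch of the question's pattern with SIGUSR1 swapped in might look like the following. It assumes a single reader that waits for one signal and then exits. SignalReader, sigReaderThread, and runSigwaitDemo are hypothetical names, and the sketch does not attempt to explain why SIGCONT behaved differently.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Hypothetical per-reader state.
struct SignalReader {
    sigset_t wakeSignals;   // the set passed to sigwait
    int received = 0;       // signal number reported by sigwait, 0 if none
};

// Reader thread: blocks in sigwait until one of wakeSignals is pending.
static void* sigReaderThread(void* arg) {
    SignalReader* r = static_cast<SignalReader*>(arg);
    int signo = 0;
    // sigwait returns 0 on success and an error number on failure.
    if (sigwait(&r->wakeSignals, &signo) == 0)
        r->received = signo;
    return nullptr;
}

// Starts one reader, sends it SIGUSR1, returns the signal it received.
int runSigwaitDemo() {
    SignalReader r;
    sigemptyset(&r.wakeSignals);
    sigaddset(&r.wakeSignals, SIGUSR1);

    // Block the signal before pthread_create so the new thread inherits the mask.
    sigset_t oldMask;
    int rc = pthread_sigmask(SIG_BLOCK, &r.wakeSignals, &oldMask);
    assert(rc == 0);

    pthread_t tid;
    rc = pthread_create(&tid, nullptr, sigReaderThread, &r);
    assert(rc == 0);

    // The signal is blocked in the reader, so it stays pending for that
    // thread until sigwait collects it.
    rc = pthread_kill(tid, SIGUSR1);
    assert(rc == 0);

    pthread_join(tid, nullptr);
    pthread_sigmask(SIG_SETMASK, &oldMask, nullptr);   // restore the caller's mask
    return r.received;
}
```

Since the signal is blocked in the reader for its whole lifetime, a pthread_kill that arrives early is not lost, and the sleep(1) before it is not strictly needed in this sketch. Build with -pthread.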

By the way, we use this pattern because condition variables and mutexes are too slow for our particular application. We need to sleep and wake individual threads very rapidly.

R. points out there is extra overhead due to additional kernel-space calls. Perhaps if you sleep more than N threads, then a single condition variable would beat multiple sigwait and pthread_kill calls. In our application, we only want to wake one thread when work arrives. You have to have a condition variable and mutex for each thread to do this, otherwise you get a stampede. In a test where we slept and woke N threads M times, signals beat mutexes and condition variables by a factor of 5 (it could have been a factor of 40, but I can't remember anymore... argh). We didn't test futexes, which can wake one thread at a time and are specifically coded to limit trips to kernel space. I suspect futexes would be faster than mutexes.
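The "one condition variable and mutex per thread" arrangement mentioned above could be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual code or benchmark: it uses four workers, and the names Worker, workerMain, wake, and runTargetedWakeDemo are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical worker slot: each worker owns its mutex, condition variable, and flags.
struct Worker {
    pthread_t tid;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool hasWork = false;
    bool stop = false;
    int jobsDone = 0;
};

// Worker thread: handles jobs until asked to stop with no job pending.
static void* workerMain(void* arg) {
    Worker* w = static_cast<Worker*>(arg);
    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->hasWork && !w->stop)
            pthread_cond_wait(&w->cond, &w->lock);
        if (!w->hasWork)
            break;              // stop requested and nothing left to do
        w->hasWork = false;     // "handle" the job
        ++w->jobsDone;
    }
    pthread_mutex_unlock(&w->lock);
    return nullptr;
}

// Wakes exactly the given worker, either with a job or with a stop request.
static void wake(Worker* w, bool stop) {
    pthread_mutex_lock(&w->lock);
    if (stop) w->stop = true; else w->hasWork = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

// Gives one job to worker `target` (0..3). Returns that worker's job count,
// or -1 if any other worker handled a job.
int runTargetedWakeDemo(int target) {
    const int kWorkers = 4;
    Worker workers[kWorkers];
    for (Worker& w : workers) {
        pthread_mutex_init(&w.lock, nullptr);
        pthread_cond_init(&w.cond, nullptr);
        int rc = pthread_create(&w.tid, nullptr, workerMain, &w);
        assert(rc == 0);
    }

    wake(&workers[target], false);                 // only this worker gets the job
    for (Worker& w : workers) wake(&w, true);      // then shut everyone down

    int othersWoken = 0;
    for (int i = 0; i < kWorkers; ++i) {
        pthread_join(workers[i].tid, nullptr);
        if (i != target) othersWoken += workers[i].jobsDone;
        pthread_cond_destroy(&workers[i].cond);
        pthread_mutex_destroy(&workers[i].lock);
    }
    return othersWoken == 0 ? workers[target].jobsDone : -1;
}
```

Note that pthread_cond_signal on a shared condition variable also wakes at least one waiter rather than all of them. What the per-thread layout adds is the ability to choose which thread wakes. Build with -pthread.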

Answer 1
5, accepted, 2011-04-05 19:03:50

Answer 2
1, 2011-04-05 19:06:02

Answer 3
0, 2011-04-21 02:00:10