Signal-safe use of sem_wait()/sem_post()

Question

I am trying to create a wrapper on Linux which controls how many concurrent executions of something are allowed at once. To do so, I am using a system-wide counting semaphore. I create the semaphore, do a sem_wait(), launch the child process and then do a sem_post() when the child terminates. That part works fine.

The problem is how to safely handle signals sent to this wrapper. If it doesn't catch signals, the wrapper might terminate without doing a sem_post(), causing the semaphore count to permanently decrease by one. So I created a signal handler which does the sem_post(). But there is still a problem.

If the handler is attached before the sem_wait() is performed, a signal could arrive before the sem_wait() completes, causing a sem_post() to occur without a matching sem_wait(). The reverse is possible if I do the sem_wait() before setting up the signal handler: a signal arriving in between kills the process after the sem_wait() but without a sem_post().

The obvious next step was to block signals during the setup of the handler and the sem_wait(). This is pseudocode of what I have now:

void handler(int sig)
{
  sem_post(sem);
  exit(1);
}

...
sigprocmask(...);   /* Block signals */
sigaction(...);     /* Set signal handler */
sem_wait(sem);
sigprocmask(...);   /* Unblock signals */
RunChild();
sem_post(sem);
exit(0);

The problem now is that the sem_wait() can block, and during that time signals are blocked. A user attempting to kill the process may end up resorting to "kill -9", which is behaviour I don't want to encourage since I cannot handle that case no matter what. I could poll with sem_trywait() at short intervals and test sigpending(), but that impacts fairness because there is no longer a guarantee that the process that has waited on the semaphore the longest will get to run next.

Is there a truly safe solution here which allows me to handle signals during semaphore acquisition? I am considering resorting to a "Do I have the semaphore" global and removing the signal blocking. That is not 100% safe, since acquiring the semaphore and setting the global isn't atomic, but it might be better than blocking signals while waiting.

Answer 1

Are you sure sem_wait() causes signals to be blocked? I don't think this is the case. The man page for sem_wait() says that the call fails with the EINTR error code if it is interrupted by a signal handler.

You should be able to handle this error code, and then your signals will be received. Have you run into a case where signals have not been received?

I would make sure you handle the error codes that sem_wait() can return. Although these errors may be rare, if you want to be 100% sure, you need to cover 100% of your bases.

Answer 2

I know this is old, but for the benefit of those still reading this courtesy of Google...

The simplest (and only?) robust solution to this problem is to use a System V semaphore. With the SEM_UNDO flag to semop(), the client acquires the semaphore resource in a way that the kernel automatically undoes NO MATTER HOW THE PROCESS EXITS.
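A rough sketch of the System V approach, under some assumptions: it uses IPC_PRIVATE to stay self-contained, whereas a system-wide wrapper would need a key shared between processes (for example from ftok()) and would have to deal with the race between creating and initialising the set. The helper names slots_create, slots_adjust, slots_acquire, slots_release and count_after_child_exit are made up for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the caller must define union semun itself (see semctl(2)). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore with an initial count of
 * `slots`. Returns the semaphore set id, or -1 on error. */
static int slots_create(int slots)
{
    assert(slots > 0);
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1)
        return -1;
    union semun arg;
    arg.val = slots;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }
    return id;
}

/* SEM_UNDO makes the kernel record the adjustment and reverse it when
 * the process terminates, however it terminates. */
static int slots_adjust(int id, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = SEM_UNDO };
    return semop(id, &op, 1);
}

static int slots_acquire(int id) { return slots_adjust(id, -1); }
static int slots_release(int id) { return slots_adjust(id, +1); }

/* Demonstration: a child takes a slot and exits without releasing it.
 * Returns the semaphore count seen by the parent afterwards. */
static int count_after_child_exit(int id)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        slots_acquire(id);
        _exit(0);            /* no release: the kernel applies the undo */
    }
    if (waitpid(pid, NULL, 0) == -1)
        return -1;
    return semctl(id, 0, GETVAL);
}
```

Note that a blocked semop() can also fail with EINTR when a handler runs, so the wait stays interruptible; the difference is that no handler has to give the slot back.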

Answer 3

Are you sure you are approaching the problem correctly? If you want to wait for a child to terminate, you may want to use the waitpid() system call. As you observed, it is not reliable to expect the child to do the sem_post() if it may receive signals.
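For reference, a small sketch of launching a child and waiting for it with waitpid(), retrying when the wait is interrupted by a signal. The function name run_child and the convention of returning 128 plus the signal number are assumptions for this example, not something from the question.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv as a child process and wait for it. Returns the child's exit
 * status, 128 + signal number if a signal killed it, or -1 on error. */
static int run_child(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);          /* only reached if execvp() failed */
    }

    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;       /* real error */
        /* interrupted by a signal handler: wait again */
    }

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

In the wrapper from the question this would take the place of RunChild(), with the parent still responsible for releasing the semaphore afterwards.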

Answer 1
7, accepted, 2009-06-01 23:39:37

Answer 2
0, 2012-10-11 22:40:02

Answer 3
0, 2009-06-02 00:04:22
