I am trying to create a wrapper on Linux which controls how many concurrent executions of something are allowed at once. To do so, I am using a system-wide counting semaphore. I create the semaphore, do a sem_wait(), launch the child process and then do a sem_post() when the child terminates. That is fine.
The problem is how to safely handle signals sent to this wrapper. If it doesn't catch signals, it might terminate without doing a sem_post(), causing the semaphore count to permanently decrease by one. So I created a signal handler which does the sem_post(). But there is still a problem.
If the handler is attached before the sem_wait() is performed, a signal could arrive before the sem_wait() completes, causing a sem_post() to occur without a sem_wait(). The reverse is possible if I do the sem_wait() before setting up the signal handler.
The obvious next step was to block signals during the setup of the handler and the sem_wait(). This is pseudocode of what I have now:
void handler(int sig)
{
    sem_post(sem);
    exit(1);
}
...
sigprocmask(...); /* Block signals */
sigaction(...);   /* Set signal handler */
sem_wait(sem);
sigprocmask(...); /* Unblock signals */
RunChild();
sem_post(sem);
exit(0);
The problem now is that the sem_wait() can block, and during that time signals are blocked. A user attempting to kill the process may end up resorting to "kill -9", which is behaviour I don't want to encourage since I cannot handle that case no matter what. I could use sem_trywait() for a small time and test sigpending(), but that impacts fairness because there is no longer a guarantee that the process waiting on the semaphore the longest will get to run next.
Is there a truly safe solution here which allows me to handle signals during semaphore acquisition? I am considering resorting to a "Do I have the semaphore" global and removing the signal blocking. That is not 100% safe, since acquiring the semaphore and setting the global isn't atomic, but it might be better than blocking signals while waiting.
Are you sure sem_wait() causes signals to be blocked? I don't think this is the case. The man page for sem_wait() says that the EINTR error code is returned from sem_wait() if it is interrupted by a signal.
You should be able to handle this error code and then your signals will be received. Have you run into a case where signals have not been received?
I would make sure you handle the error codes that sem_wait() can return. Although they may be rare, you need to cover every one of them if you want to be 100% sure.
I know this is old, but for the benefit of those still reading this courtesy of Google...
The simplest (and only?) robust solution to this problem is to use a System V semaphore, which lets the client acquire the semaphore resource in such a way (the SEM_UNDO flag to semop()) that it is automatically returned by the kernel NO MATTER HOW THE PROCESS EXITS.
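A minimal sketch of that idea on Linux, using semget(), semctl() and semop() with SEM_UNDO. The helper names slots_open, slot_acquire and slot_release are hypothetical, and the sketch ignores the known race where a second process opens the set between its creation and the SETVAL that initializes it.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller has to define union semun itself (see semctl(2)). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a one-semaphore set holding `slots` permits, or open it if it
 * already exists. Returns the semaphore id, or -1 on error. */
int slots_open(key_t key, int slots)
{
    int id = semget(key, 1, IPC_CREAT | IPC_EXCL | 0600);
    if (id >= 0) {
        union semun arg;
        arg.val = slots;
        if (semctl(id, 0, SETVAL, arg) == -1)
            return -1;
        return id;
    }
    if (errno == EEXIST)
        return semget(key, 1, 0);   /* someone else created it */
    return -1;
}

/* Take one permit. SEM_UNDO makes the kernel record an adjustment that
 * is applied when the process exits, however it exits. */
int slot_acquire(int id)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };
    return semop(id, &op, 1);
}

/* Give the permit back explicitly on the normal path. */
int slot_release(int id)
{
    struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };
    return semop(id, &op, 1);
}
```

A blocked semop() still fails with EINTR when a handler runs, so the wrapper stays killable while waiting, and no handler is needed to give the permit back.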
Are you sure you are approaching the problem correctly? If you want to wait for a child terminating, you may want to use the waitpid() system call. As you observed, it is not reliable to expect the child to do the sem_post() if it may receive signals.
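For reference, a small sketch of launching the child and waiting for it with waitpid(), retrying when the wait is interrupted by a signal. The function name run_child and the 127 / 128+signal return conventions are assumptions borrowed from common shell practice, not something the question specifies.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec argv[0] with argv, and wait for it to terminate.
 * Returns the child's exit status, 128 + signal number if it was
 * killed by a signal, or -1 if fork()/waitpid() failed. */
int run_child(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* exec failed */
    }

    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;              /* real error */
        /* interrupted by a signal handler: wait again */
    }

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

The wrapper would call this between acquiring and releasing the semaphore, so the release always happens in the parent rather than depending on the child.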