In chapter 5 of Unix Network Programming by Stevens et al., there are a server program and a client program, as follows:
Server:
mysignal(SIGCHLD, sig_child);
for(;;)
{
    connfd = accept(listenfd, (struct sockaddr *)&ca, &ca_len);
    pid = fork();
    if(pid == 0)
    {
        //sleep(60);
        close(listenfd);
        str_echo(connfd);
        close(connfd);
        exit(0);
    }
    close(connfd);
}
The function sig_child handles the SIGCHLD signal; its code is as follows:
void sig_child(int signo)
{
    pid_t pid;
    int stat;
    static int i = 1;   /* counts handler invocations */
    i++;
    while(1)
    {
        pid = wait(&stat);
        if(pid > 0)
        {
            printf("ith: %d, child %d terminated\n", i, pid);
        }
        else
        {
            break;
        }
    }
    //pid = wait(&stat);
    return;
}
Client:
for(i = 0 ; i < 5; i++)
{
    sockfd[i] = socket(AF_INET, SOCK_STREAM, 0);
    if(sockfd[i] < 0)
    {
        perror("create error");
        exit(-1);
    }
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(5900);
    if(inet_pton(AF_INET, argv[1], &sa.sin_addr) != 1)
    {
        perror("inet_pton error");
        exit(-1);
    }
    connect(sockfd[i], (struct sockaddr *)&sa, sizeof(sa));
}
str_cli(sockfd[0], stdin);
exit(0);
As you can see from the client's source code, the program establishes five connections to the server, but only one of them is used; after str_cli() finishes, exit(0) is called. All the connections are then closed, the five child processes in the server exit, and SIGCHLD is sent to the parent process, which handles it with sig_child(). The while loop should then ensure that the parent waits for all the child processes correctly. I tested the program a couple of times and it works well: all the children are cleaned up.
But in the book, the authors wrote that "wait could not work properly, because function wait may become blocked before all the child processes exit". Is that statement in the book right? If so, could you please explain it in more detail? (PS: I think wait() in a while loop would properly handle the exit of all the child processes.)
My two cents on what the authors meant by that statement.
SIGCHLD is raised in the following three cases:
1) a child terminates, 2) a child is stopped, 3) a stopped child is continued.
In cases 2 and 3, wait() will block because the child has not yet exited.
The problem is not with wait(), but with signal delivery. The sig_chld function in the book doesn't have the while loop; it waits for only one child:
void sig_child(int signo)
{
    pid_t pid;
    int stat;
    pid = wait(&stat);
    printf("child %d terminated\n", pid);
    return;
}
When the client exits, all connections are closed and all the children eventually terminate. The first SIGCHLD signal is delivered, and on entry to the signal handler SIGCHLD is blocked. Standard signals are not queued, so any further SIGCHLD signals that arrive while it is blocked collapse into at most one pending signal; the rest are lost, leaving zombie children in the server.
You can fix this by wrapping wait() in a loop, as you did. Another solution is to ignore SIGCHLD explicitly, which is valid when you don't need the exit status of your children.
While wait() in a loop eventually reaps all children, it has a drawback: wait() blocks if there are children still running. The process is then stuck in the signal handler until all children have terminated.
The solution in the book is to use waitpid() with the WNOHANG option in a loop:
while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
    printf("child %d terminated\n", pid);
This loop reaps all terminated children, but returns as soon as none are left to reap, even if other children are still running.
To reproduce the server hanging in the signal handler, you must do the following