The wait function in the book Unix Network Programming?

Question

In chapter 5 of the book Unix Network Programming by Stevens et al., there are a server program and a client program as follows:

server

mysignal(SIGCHLD, sig_child);
for(;;)
{
    connfd = accept(listenfd, (struct sockaddr *)&ca, &ca_len);

    pid = fork();
    if(pid == 0)
    {
        //sleep(60);
        close(listenfd);
        str_echo(connfd);
        close(connfd);
        exit(0);
    }
    close(connfd);
}

The function sig_child handles the SIGCHLD signal; the code is as follows:

void sig_child(int signo)
{
    pid_t pid;
    int stat;
    static int i = 1;
    i++;
    while(1)
    {
        pid = wait(&stat);
        if(pid > 0)
        {
            printf("ith: %d, child %d terminated\n", i, pid);
        }
        else
        {
            break;
        }
    }   
    //pid = wait(&stat);
    return;
}

client

for(i = 0 ; i < 5; i++)
{
    sockfd[i] = socket(AF_INET, SOCK_STREAM, 0);
    if(sockfd[i] < 0)
    {
        perror("create error");
        exit(-1);
    }

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(5900);

    if(inet_pton(AF_INET, argv[1], &sa.sin_addr) != 1)
    {
        perror("inet_pton error");
        exit(-1);
    }
    connect(sockfd[i], (struct sockaddr *)&sa, sizeof(sa));
}
str_cli(sockfd[0], stdin);
exit(0);

As the client source shows, the program establishes five connections to the server, but only one of them is used. After str_cli finishes, exit(0) is called, which closes all five connections. The five child processes in the server then exit, and SIGCHLD is sent to the parent process, which handles it in sig_child. The while loop there should make sure the parent waits for every child. I tested the program a couple of times and it works well: all the children are cleaned up.

But in the book, the authors wrote that "wait could not work properly, because function wait may become blocked before all the child processes exit". Is that statement in the book right? If so, could you please explain it in more detail? (PS: I think wait in a while loop would properly handle the exit of all the child processes.)

Answer 1

My two cents on what the authors meant by that statement.

SIGCHLD is raised in the following three cases:

1) A child exits 2) A child is stopped 3) A stopped child is continued.

In cases 2 and 3, wait will block because the child has not yet exited.
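If stop and continue notifications are the concern, one option is to ask the kernel not to send SIGCHLD for them. The following is a minimal sketch, assuming POSIX sigaction is available; the names install_sigchld_handler and on_sigchld are made up for illustration and are not from the book.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Placeholder handler (hypothetical); a real one would reap children. */
static void on_sigchld(int signo)
{
    (void)signo;
}

/*
 * Install `handler` for SIGCHLD with SA_NOCLDSTOP, so that SIGCHLD is
 * generated only when a child terminates, not when it stops or continues.
 * Returns 0 on success, -1 on error (errno is set by sigaction).
 */
int install_sigchld_handler(void (*handler)(int))
{
    struct sigaction sa;

    assert(handler != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    /* SA_RESTART: restart interrupted slow calls such as accept(). */
    sa.sa_flags = SA_NOCLDSTOP | SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

A server would call install_sigchld_handler(on_sigchld) once, before entering its accept loop.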

Answer 2

The problem is not with wait, but with signal delivery. The sig_chld function in the book doesn't have the while loop; it waits for only one child:

void sig_child(int signo)
{
    pid_t pid;
    int stat;
    pid = wait(&stat);
    printf("child %d terminated\n", pid);
    return;
}

When the client exits, all connections are closed and all the children eventually terminate. The first SIGCHLD signal is delivered, and while the signal handler runs, SIGCHLD is blocked. Signals that arrive in the meantime are not queued: at most one SIGCHLD stays pending, so several terminations can collapse into a single delivery. The handler then reaps only one child, and the rest remain as zombies in the server.
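The lost-signal effect can be observed without any sockets. The sketch below blocks SIGCHLD, lets several children exit, and then counts how often the handler runs once the signal is unblocked. It assumes a single-threaded process and a system that provides waitid with WNOWAIT (Linux does); sigchld_deliveries, count_sigchld and MAX_CHILDREN are illustrative names only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16

static volatile sig_atomic_t sigchld_calls;

static void count_sigchld(int signo)
{
    (void)signo;
    sigchld_calls++;
}

/*
 * Fork `n` children that exit immediately while SIGCHLD is blocked, then
 * unblock it. Returns how many times the handler ran (-1 on error) and
 * stores the number of children reaped afterwards in *reaped.
 */
int sigchld_deliveries(int n, int *reaped)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;
    pid_t pids[MAX_CHILDREN];
    siginfo_t info;
    int i;

    assert(n > 0 && n <= MAX_CHILDREN && reaped != NULL);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = count_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    for (i = 0; i < n; i++)
    {
        pids[i] = fork();
        if (pids[i] < 0)
        {
            n = i;          /* carry on with the children created so far */
            break;
        }
        if (pids[i] == 0)
            _exit(0);       /* child: terminate at once */
    }

    /* Wait until every child is a zombie, without reaping it (WNOWAIT). */
    for (i = 0; i < n; i++)
        waitid(P_PID, (id_t)pids[i], &info, WEXITED | WNOWAIT);

    sigchld_calls = 0;
    /* The pending SIGCHLD is delivered when the old mask is restored. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);

    *reaped = 0;
    for (i = 0; i < n; i++)
        if (waitpid(pids[i], NULL, 0) == pids[i])
            (*reaped)++;

    sigaction(SIGCHLD, &old_sa, NULL);
    return (int)sigchld_calls;
}
```

With five children, the handler count is expected to be far smaller than the number of children that terminated (a single delivery on Linux), which is why a handler that calls wait only once leaves zombies behind.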

You can fix this by wrapping wait in a loop, as you did. Another solution is to ignore SIGCHLD explicitly, which is valid when you don't need the exit status of your children.
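As a rough illustration of the second option, the sketch below sets SIGCHLD to SIG_IGN and then checks that a terminated child can no longer be waited for. It assumes the POSIX.1-2001 behaviour that Linux implements; ignored_sigchld_autoreaps is a made-up name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Ignore SIGCHLD, fork one child that exits immediately, and try to wait
 * for it. Returns 1 if waitpid() fails with ECHILD (the kernel discarded
 * the child, so no zombie is left), 0 if the child could still be waited
 * for, and -1 on any other error. The previous SIGCHLD disposition is
 * restored before returning.
 */
int ignored_sigchld_autoreaps(void)
{
    struct sigaction sa, old_sa;
    pid_t pid, ret;
    int saved_errno;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        sigaction(SIGCHLD, &old_sa, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);           /* child: terminate at once */

    /* Blocks until the child is gone, then reports that no child exists. */
    ret = waitpid(pid, NULL, 0);
    saved_errno = errno;
    assert(ret != 0);       /* 0 is only possible with WNOHANG */

    sigaction(SIGCHLD, &old_sa, NULL);
    if (ret == pid)
        return 0;
    return (saved_errno == ECHILD) ? 1 : -1;
}
```

In the echo server this would amount to a single call that sets SIGCHLD to SIG_IGN before the accept loop, with no handler at all.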

While wait in a loop eventually waits for all children, it has a drawback: wait blocks if there are still children running. This means the process is stuck in the signal handler until all children have terminated.

The solution in the book is to use waitpid with the WNOHANG option in a loop:

while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
    printf("child %d terminated\n", pid);

This loop reaps all terminated children, but returns as soon as none are left to reap, even if other children are still running.
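For completeness, here is a sketch of a whole handler built around that loop, plus a small check that it returns while a child is still running. It is not the book's exact code: the printf is left out because printf is not async-signal-safe, errno is saved and restored, and the names sig_chld_nohang, demo_nohang and reaped_total are assumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped_total;

/* SIGCHLD handler: reap every child that has already terminated, never block. */
void sig_chld_nohang(int signo)
{
    int saved_errno = errno;    /* waitpid() may overwrite errno */

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_total++;
    errno = saved_errno;
}

/*
 * Demo: one child keeps running while `n_exiting` children exit at once.
 * Calls the handler directly and returns how many children it reaped
 * (-1 if the long-running child could not be created).
 */
int demo_nohang(int n_exiting)
{
    pid_t runner, pid;
    siginfo_t info;
    int i;

    assert(n_exiting >= 0);

    runner = fork();
    if (runner < 0)
        return -1;
    if (runner == 0)
    {
        for (;;)
            pause();            /* long-running child */
    }

    for (i = 0; i < n_exiting; i++)
    {
        pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            _exit(0);
        /* Wait until this child is a zombie, without reaping it. */
        waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT);
    }

    reaped_total = 0;
    sig_chld_nohang(SIGCHLD);   /* returns although `runner` is alive */

    kill(runner, SIGKILL);      /* clean up the long-running child */
    waitpid(runner, NULL, 0);
    return (int)reaped_total;
}
```

Calling demo_nohang(3) should come back promptly with 3, whereas a handler built on a plain wait loop would sit there until the long-running child ended.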

To reproduce the server hanging in the signal handler, you must do the following:

1) Start the server
2) Start the first client
3) Start the second client
4) Close one of the clients
5) Start a third client
6) Enter text in the third client

You won't get a response.
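The hang produced by the steps above can be mimicked without sockets: one child has already exited (the closed client) and another is still busy (the client that stays connected). The sketch below times the blocking loop in that situation; blocking_wait_loop_ms and now_ms are hypothetical helper names, and the busy child simply sleeps instead of serving a connection.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Current time on the monotonic clock, in milliseconds. */
static long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/*
 * One child exits at once, another stays alive for `busy_ms` milliseconds.
 * Runs the blocking loop `while (wait(NULL) > 0)` and returns how long the
 * loop took in milliseconds (-1 if fork failed).
 */
long blocking_wait_loop_ms(long busy_ms)
{
    struct timespec nap;
    siginfo_t info;
    pid_t done, busy;
    long start;

    assert(busy_ms > 0);

    done = fork();
    if (done < 0)
        return -1;
    if (done == 0)
        _exit(0);
    /* Make sure the first child is already a zombie (WNOWAIT: do not reap). */
    waitid(P_PID, (id_t)done, &info, WEXITED | WNOWAIT);

    busy = fork();
    if (busy < 0)
    {
        waitpid(done, NULL, 0);
        return -1;
    }
    if (busy == 0)
    {
        nap.tv_sec = busy_ms / 1000;
        nap.tv_nsec = (busy_ms % 1000) * 1000000L;
        nanosleep(&nap, NULL);
        _exit(0);
    }

    start = now_ms();
    while (wait(NULL) > 0)
        ;   /* reaps `done` at once, then blocks until `busy` exits */
    return now_ms() - start;
}
```

The returned time should be close to busy_ms, since the loop cannot finish before the busy child does. In the server, the parent is stuck inside the signal handler for that whole period and cannot call accept for the third client.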

2 answers

solution1
0 2014-05-22 09:49:57

solution2
0 ACCEPTED 2014-05-22 13:18:00
