
How does output redirection in the shell work for child processes created by fork() in C on Linux?

I am currently studying operating systems and concurrency. As an exercise on the process scheduler, I am using C to observe how multiple processes run "in parallel" on Linux at millisecond granularity. Here is my code:

/* This file's name is Task05_3.c */
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <errno.h>
#include <string.h>

int kill(pid_t pid, int sig);
unsigned usleep(unsigned seconds);

#define NUMBER_OF_PROCESSES 7
#define MAX_EXPERIMENT_DURATION 4

long int getDifferenceInMilliSeconds(struct timeval start, struct timeval end)
{
    int seconds = end.tv_sec - start.tv_sec;
    int useconds = end.tv_usec - start.tv_usec;
    int mtime = (seconds * 1000 + useconds / 1000);
    return mtime;
}

int main(int argc, char const *argv[])
{
    struct timeval startTime, currentTime;
    int diff;

    int log[MAX_EXPERIMENT_DURATION + 2] = {-1};
    /* initialization */
    for (int k = 0; k < MAX_EXPERIMENT_DURATION + 2; ++k)
        log[k] = -1;

    gettimeofday(&startTime, NULL);

    pid_t pid_for_diss = 0;

    for (int i = 0; i < NUMBER_OF_PROCESSES; ++i)
    {
        pid_for_diss = fork();
        if (pid_for_diss < 0) {
            printf("fork error, errno(%d): %s\n", errno, strerror(errno));
        } else if (pid_for_diss == 0) {
            /* This loop is for logging when the child process is running */
            while (1) {
                gettimeofday(&currentTime, NULL);
                diff = getDifferenceInMilliSeconds(startTime, currentTime);
                if (diff > MAX_EXPERIMENT_DURATION)
                {
                    break;
                }
                log[diff] = i;
            }
            // for (int k = 0; k < MAX_EXPERIMENT_DURATION + 2; ++k)
            // {
            //     if (log[k] != -1)
            //     {
            //         printf("%d, %d\n", log[k], k);
            //     }
            // }
            // exit(0);
            break;
        }
    }

    /* This loop is for print the logged results out */
    if (pid_for_diss == 0)
    {
        for (int k = 0; k < MAX_EXPERIMENT_DURATION + 2; ++k)
        {
            if (log[k] != -1)
            {
                printf("%d, %d\n", log[k], k);
            }
        }
        kill(getpid(), SIGKILL);
    }

    int status;
    while (wait(&status) != -1);// -1 means wait() failed
    printf("Bye from the parent!\n");
}

Basically, the parent process uses a for loop to create 7 child processes with fork(), and each child spins in a while loop so that they all compete for the CPU for a fixed period. Each time a child is scheduled to run, it records (approximately) the time elapsed since the parent's start time in its own copy of the log array. Once a child breaks out of the while loop, another for loop prints that child's logged results.

However, when I tried to redirect the output into a .csv file on the Linux machine, something weird happened. First, with the printing loop placed outside the main for loop (as in the code above), I ran ./Task05_3 directly in bash. Here is the result:

psyhq@bann:osc$ gcc -std=c99 Task05_3.c -o Task05_3
psyhq@bann:osc$ ./Task05_3
5, 0
4, 0
6, 0
4, 1
1, 0
4, 2
4, 3
4, 4
0, 0
1, 1
6, 1
1, 2
1, 3
1, 4
5, 1
5, 2
5, 3
5, 4
6, 2
6, 3
2, 0
6, 4
2, 1
2, 2
2, 3
2, 4
0, 1
3, 0
0, 2
0, 3
0, 4
3, 1
3, 2
3, 3
3, 4
Bye from the parent!
psyhq@bann:osc$

As you can see, all the output (from both the parent and the child processes) is printed in the terminal, and the children's lines appear in no particular order (which I think is because multiple processes are writing to standard output at the same time). However, if I run it as ./Task05_3 > 5output_c.csv, the .csv file contains only the output of the parent process. It looks like this: [image: Result_in_csv01]

So my first question is: why does the .csv file contain only the parent process's output? Is it because the redirection I typed in bash applies only to the parent process's output and has nothing to do with the child processes' output streams?

What's more, when I move the printing for loop inside the main for loop (see the commented-out loop in the code above) and run ./Task05_3 > 5output_c.csv, something even more confusing happens. The .csv file now looks like this: [image: Result_in_csv02]

It now contains all the results, and the children's results are no longer in random order! (Apparently the other child processes wait until the running child has printed all of its results.) So my second question is: how can this happen when all I did was move the for loop?

P.S. The Linux machine I ran my code on:

psyhq@bann:osc$ cat /proc/version
Linux version 3.10.0-693.2.2.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Tue Sep 12 22:26:13 UTC 2017

And the GCC version is:

psyhq@bann:osc$ gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Output via stdio functions is buffered by default. That means it is not written immediately but accumulates in an internal structure (inside the FILE) until something triggers a write. There are three possibilities:

  • Unbuffered. Output is written immediately.
  • Line buffered. Output is written when the buffer is full or when a '\n' (newline) is seen.
  • Block buffered. Output is written when the buffer is full.

You can always force a write manually by calling fflush.

Files you open (with fopen) are block buffered by default. stderr starts out unbuffered. stdout is line buffered if it refers to a terminal, and block buffered otherwise.
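The three buffering modes can be observed directly. The following is a minimal sketch, not part of the original program: the helper name bytes_visible_before_flush is hypothetical, and it assumes a POSIX system where fstat reports the file size as soon as data has been handed to the kernel. It selects a mode with setvbuf, writes one short line, and checks how much has reached the file before any flush or close.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: write "hello\n" (6 bytes) to a temporary file using
 * the given buffering mode (_IONBF, _IOLBF or _IOFBF) and report how many
 * bytes have reached the file BEFORE any fflush or fclose.
 * Returns -1 on error. */
long bytes_visible_before_flush(int mode)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0) {
        fclose(fp);
        return -1;
    }

    fputs("hello\n", fp);

    /* Ask the kernel how large the file is right now, without flushing. */
    struct stat st;
    long size = -1;
    if (fstat(fileno(fp), &st) == 0)
        size = (long)st.st_size;

    fclose(fp);
    return size;
}
```

On a typical Linux system I would expect this to report 6 bytes for _IONBF and _IOLBF (the newline triggers the write) and 0 bytes for _IOFBF, because six bytes do not come close to filling the buffer.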

Your child processes print full lines (printf("%d, %d\n", log[k], k);). That means as long as stdout goes to a terminal, every line appears immediately (because stdout is line buffered).

But when you redirect output to a file, stdout becomes block buffered. The buffer can be pretty big, so all of your output accumulates in the buffer (it never gets full). Usually the buffer is also flushed (i.e. written and emptied) when the FILE handle is closed (with fclose), and usually all open files are closed automatically when your program ends (by returning from main or by calling exit).

However, in this case, you terminate the process by sending it a (deadly, uncatchable) signal. That means your files are never closed and your buffers are never written, so their contents are lost. That's why you don't see any output from the children.
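The effect can be reproduced in isolation. This is only a sketch under stated assumptions: the helper name bytes_received_from_killed_child is hypothetical, and a pipe with an explicitly block-buffered stream stands in for the redirected stdout. The child prints one line and kills itself with SIGKILL; the parent counts how many bytes actually arrive.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that prints one line through a
 * block-buffered stdio stream attached to a pipe and then kills itself
 * with SIGKILL. If flush_first is nonzero, the child calls fflush first.
 * Returns the number of bytes the parent received, or -1 on error. */
long bytes_received_from_killed_child(int flush_first)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: write through stdio, then die without any cleanup. */
        close(fds[0]);
        FILE *out = fdopen(fds[1], "w");
        if (out == NULL)
            _exit(1);
        setvbuf(out, NULL, _IOFBF, BUFSIZ);  /* like a redirected stdout */
        fprintf(out, "%d, %d\n", 4, 0);      /* 5 bytes, held in the buffer */
        if (flush_first)
            fflush(out);                     /* hand the bytes to the kernel */
        kill(getpid(), SIGKILL);
        _exit(1);                            /* not reached */
    }

    /* Parent: count everything that comes through the pipe until EOF. */
    close(fds[1]);
    long total = 0;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

Without the fflush the parent should receive nothing; with it, the 5 bytes of "4, 0\n" arrive. So one way to keep the kill in the original program would be to call fflush(stdout) just before it.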


In your second version, you call exit instead of sending yourself a signal. exit performs the normal cleanup: calling atexit handlers, closing all open files, and flushing their buffers.
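For comparison, here is a sketch of a child that ends with exit. The names capture_child_exit_output, say_goodbye and child_out are hypothetical, and again a pipe stands in for the redirected file. The child registers an atexit handler, writes one line to a block-buffered stream, and calls exit(0); the parent collects whatever arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static FILE *child_out;  /* stream used by the child; set after fork */

/* atexit handler: runs during exit(), before the streams are flushed. */
static void say_goodbye(void)
{
    fputs("atexit\n", child_out);
}

/* Hypothetical helper: the child writes "data\n" to a block-buffered
 * stream on a pipe and ends with exit(0). Everything the parent reads is
 * copied into dst as a NUL-terminated string.
 * Returns the number of bytes read, or -1 on error. */
long capture_child_exit_output(char *dst, size_t dst_size)
{
    int fds[2];
    if (dst == NULL || dst_size == 0 || pipe(fds) == -1)
        return -1;

    fflush(NULL);  /* keep the child from re-emitting the parent's buffers */

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        child_out = fdopen(fds[1], "w");
        if (child_out == NULL)
            _exit(1);
        setvbuf(child_out, NULL, _IOFBF, BUFSIZ);
        atexit(say_goodbye);
        fputs("data\n", child_out);  /* stays in the buffer for now */
        exit(0);                     /* runs handlers, then flushes */
    }

    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used + 1 < dst_size &&
           (n = read(fds[0], dst + used, dst_size - 1 - used)) > 0)
        used += (size_t)n;
    dst[used] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (long)used;
}
```

Here the parent should see both the line written before exit and the line added by the handler, even though the child never calls fflush itself.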


By the way, instead of kill(getpid(), X) you can write raise(X). It's shorter and more portable (raise is standard C).
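A small sketch of that equivalence, assuming a single-threaded child on a POSIX system (the helper name signal_that_ended_child is hypothetical): the child ends itself either way, and the parent reports which signal terminated it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sends itself SIGKILL, either via
 * raise() or via kill(getpid(), ...). Returns the number of the signal that
 * terminated the child, 0 if it exited normally, or -1 on error. */
int signal_that_ended_child(int use_raise)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (use_raise)
            raise(SIGKILL);
        else
            kill(getpid(), SIGKILL);
        _exit(0);  /* not reached */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Both calls should make the parent observe SIGKILL as the terminating signal.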
