Implementing pipelining in C. What would be the best way to do that?

I can't think of any way to implement pipelining in C that would actually work, which is why I've decided to ask here. I should say that I understand how pipe/fork/mkfifo work. I've seen plenty of examples that implement pipelines of 2-3 commands; that's easy. My problem starts when I have to implement a shell and the number of pipes is unknown.

What I've got now, e.g.:

ls -al | tr a-z A-Z | tr A-Z a-z | tr a-z A-Z

I transform such a line into something like this:

array[0] = {"ls", "-al", NULL}
array[1] = {"tr", "a-z", "A-Z", NULL}
array[2] = {"tr", "A-Z", "a-z", NULL}
array[3] = {"tr", "a-z", "A-Z", NULL}

So I can use

execvp(array[0],array)

later on.
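One way to get from the command line to arrays like these is sketched below. This is only an illustration, not the asker's code: split_pipeline, MAX_STAGES and MAX_ARGS are hypothetical names, and quoting, escaping and redirections are ignored. Each stage ends up as its own NULL-terminated argv, so the call for stage i would be execvp(stages[i][0], stages[i]).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

#define MAX_STAGES 16   /* illustrative limits, not from the question */
#define MAX_ARGS   16

/* Hypothetical helper: splits `line` in place into stages separated by '|'
 * and each stage into whitespace-separated words. stages[i] becomes a
 * NULL-terminated argv. Returns the number of stages, or -1 when a limit
 * is exceeded or a stage contains no words. strtok_r collapses adjacent
 * delimiters, so "a || b" is read as two stages. */
int split_pipeline(char *line, char *stages[MAX_STAGES][MAX_ARGS + 1])
{
    int n = 0;
    char *save_stage = NULL;

    for (char *stage = strtok_r(line, "|", &save_stage);
         stage != NULL;
         stage = strtok_r(NULL, "|", &save_stage)) {
        if (n == MAX_STAGES)
            return -1;

        int argc = 0;
        char *save_word = NULL;
        for (char *word = strtok_r(stage, " \t\n", &save_word);
             word != NULL;
             word = strtok_r(NULL, " \t\n", &save_word)) {
            if (argc == MAX_ARGS)
                return -1;
            stages[n][argc++] = word;
        }
        if (argc == 0)
            return -1;              /* e.g. "ls | | tr" */
        stages[n][argc] = NULL;     /* execvp needs a NULL-terminated argv */
        n++;
    }
    return n;
}
```

The words point into the caller's buffer, so the line must stay alive (and writable) for as long as the stages are used.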

Until now, I believe everything is OK. The problem starts when I try to redirect those commands' input/output to each other.

Here's how I'm doing that:

    mkfifo("queue", 0777);

    for (i = 0; i<= pipelines_count; i++) // eg. if there's 3 pipelines, there's 4 functions to execvp
    {
    int b = fork();             
    if (b == 0) // child
        {           
        int c = fork();

        if (c == 0) 
        // baby (younger than child) 
        // I use c process, to unblock desc_read and desc_writ for b process only
        // nothing executes in here
            {       
            if (i == 0) // 1st pipeline
                {
                int desc_read = open("queue", O_RDONLY);
                // dup2 here, so after closing there's still something that can read
                // from desc_read
                dup2(desc_read, 0); 
                close(desc_read);           
                }

            if (i == pipelines_count) // last pipeline
                {
                int desc_write = open("queue", O_WRONLY);
                dup2(desc_write, 0);
                close(desc_write);                              
                }

            if (i > 0 && i < pipelines_count) // pipeline somewhere inside
                {
                int desc_read = open("queue", O_RDONLY);
                int desc_write = open("queue", O_WRONLY);
                dup2(desc_write, 1);
                dup2(desc_read, 0);
                close(desc_write);
                close(desc_read);
                }               
            exit(0); // closing every connection between process c and pipeline             
            }
        else
        // b process here
        // in b process, i execvp commands
        {                       
        if (i == 0) // 1st pipeline (changing stdout only)
            {   
            int desc_write = open("queue", O_WRONLY);               
            dup2(desc_write, 1); // changing stdout -> pdesc[1]
            close(desc_write);                  
            }

        if (i == pipelines_count) // last pipeline (changing stdin only)
            {   
            int desc_read = open("queue", O_RDONLY);                                    
            dup2(desc_read, 0); // changing stdin -> pdesc[0]   
            close(desc_read);           
            }

        if (i > 0 && i < pipelines_count) // pipeline somewhere inside
            {               
            int desc_write = open("queue", O_WRONLY);       
            dup2(desc_write, 1); // changing stdout -> pdesc[1]
            int desc_read = open("queue", O_RDONLY);                            
            dup2(desc_read, 0); // changing stdin -> pdesc[0]
            close(desc_write);
            close(desc_read);                               
            }

        wait(NULL); // waits until process c has exited
        execvp(array[0],array);         
        }
        }
    else // parent (waits for 1 sub command to be finished)
        {       
        wait(NULL);
        }       
    }

Thanks.

Patryk, why are you using a FIFO, and moreover the same FIFO for each stage of the pipeline?

It seems to me that you need a pipe between each stage. So the flow would be something like:

Shell             ls               tr                tr
-----             ----             ----              ----
pipe(fds);
fork();  
close(fds[0]);    close(fds[1]);
                  dup2(fds[0],0); 
                  pipe(fds);
                  fork();         
                  close(fds[0]);   close(fds[1]);  
                  dup2(fds[1],1);  dup2(fds[0],0);
                  exec(...);       pipe(fds);
                                   fork();     
                                   close(fds[0]);     etc
                                   dup2(fds[1],1);
                                   exec(...);

The sequence that runs in each forked shell (close, dup2, pipe, etc.) looks like a good candidate for a function (taking the name and parameters of the desired process). Note that up until the exec call in each, a forked copy of the shell is running.
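A minimal sketch of such a function follows. It is a variant of the diagram rather than a literal transcription: here the shell itself creates every pipe and forks every stage, instead of each stage forking the next one. The names spawn_stage and run_pipeline are hypothetical, and error handling is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that runs argv with stdin taken from
 * in_fd and stdout sent to out_fd (-1 leaves either one inherited).
 * unused_fd is a descriptor the child must not keep open (the read end of
 * the pipe it writes to), or -1. Returns the child's pid, or -1 if fork
 * failed. Assumes in_fd and out_fd are not already 0 or 1. */
static pid_t spawn_stage(char *const argv[], int in_fd, int out_fd, int unused_fd)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                     /* parent, or fork error */

    /* Child: still a copy of the shell until execvp succeeds. */
    if (unused_fd != -1)
        close(unused_fd);
    if (in_fd != -1) {
        dup2(in_fd, STDIN_FILENO);
        close(in_fd);
    }
    if (out_fd != -1) {
        dup2(out_fd, STDOUT_FILENO);
        close(out_fd);
    }
    execvp(argv[0], argv);
    perror(argv[0]);                    /* only reached if exec failed */
    _exit(127);
}

/* Runs cmds[0] | cmds[1] | ... | cmds[n-1] with one pipe between each pair
 * of neighbours. Returns the exit status of the last command, or -1 if the
 * pipeline could not be started or the last command did not exit normally. */
int run_pipeline(char **cmds[], int n)
{
    int prev_read = -1;     /* read end of the pipe feeding the next stage */
    pid_t last = -1;
    int i;

    for (i = 0; i < n; i++) {
        int fds[2] = { -1, -1 };
        if (i < n - 1 && pipe(fds) == -1) {
            perror("pipe");
            break;
        }
        last = spawn_stage(cmds[i], prev_read, fds[1], fds[0]);

        /* The shell must close its own copies, or readers never see EOF. */
        if (prev_read != -1)
            close(prev_read);
        if (fds[1] != -1)
            close(fds[1]);
        prev_read = fds[0];

        if (last == -1) {
            perror("fork");
            break;
        }
    }
    if (prev_read != -1)
        close(prev_read);
    if (i < n)
        last = -1;                      /* pipeline was not fully started */

    /* Wait only after every stage has been started. Note that this reaps
     * all children of the calling process, not just the ones started here. */
    int status, result = -1;
    pid_t pid;
    while ((pid = wait(&status)) > 0)
        if (pid == last && WIFEXITED(status))
            result = WEXITSTATUS(status);
    return result;
}
```

Because the number of stages is just the loop bound, the same code handles one command or twenty; the shell never holds more than two pipe descriptors at a time.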

Edit:

Patryk:

Also, is my thinking correct? Shall it work like that? (pseudocode): 
start_fork(ls) -> end_fork(ls) -> start_fork(tr) -> end_fork(tr) -> 
start_fork(tr) -> end_fork(tr) 

I'm not sure what you mean by start_fork and end_fork. Are you implying that ls runs to completion before tr starts? That isn't really what the diagram above means. Your shell will not wait for ls to complete before starting tr. It starts all of the processes in the pipeline in sequence, setting up stdin and stdout for each one so that the processes are linked together: stdout of ls to stdin of tr, stdout of tr to stdin of the next tr. That is what the dup2 calls are doing.

The order in which the processes run is determined by the operating system (the scheduler), but clearly if tr runs and reads from an empty stdin it has to wait (block) until the preceding process writes something to the pipe. It is quite possible that ls will run to completion before tr even reads from its stdin, but it is equally possible that it won't. For example, if the first command in the chain ran continually and produced output along the way, the second in the pipeline would get scheduled from time to time to process whatever the first sends along the pipe.

Hope that clarifies things a little :-)

It might be worth using libpipeline. It takes care of all the effort on your part and you can even include functions in your pipeline.

The problem is you're trying to do everything at once. Break it into smaller steps instead.

1) Parse your input to get ls -al | out of it.

1a) From this you know you need to create a pipe, move it to stdout, and start ls -al. Then move the pipe to stdin. There's more coming of course, but you don't worry about it in code yet.

2) Parse the next segment to get tr a-z A-Z |. Go back to step 1a as long as your next-to-spawn command's output is being piped somewhere.
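The steps above can be sketched roughly as follows, under the assumption that "move the pipe" means re-pointing the shell's own stdin/stdout with dup2 so that each spawned command simply inherits them. The names start and run_steps are hypothetical, the commands are taken as already parsed, and the starting descriptor number 10 for the saved copies is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and exec argv; the child inherits whatever the shell's stdin and
 * stdout currently point at. Returns the child's pid, or -1. */
static pid_t start(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        perror(argv[0]);                /* only reached if exec failed */
        _exit(127);
    }
    return pid;
}

/* Runs cmds[0] | ... | cmds[n-1]. Returns the exit status of the last
 * command, or -1 if something failed along the way. */
int run_steps(char **cmds[], int n)
{
    /* Remember the shell's real stdin/stdout. Close-on-exec keeps these
     * copies out of the spawned commands. */
    int saved_in = fcntl(STDIN_FILENO, F_DUPFD_CLOEXEC, 10);
    int saved_out = fcntl(STDOUT_FILENO, F_DUPFD_CLOEXEC, 10);
    pid_t last = -1;
    int i = 0;

    if (saved_in != -1 && saved_out != -1) {
        for (i = 0; i < n; i++) {
            int fds[2] = { -1, -1 };
            if (i < n - 1) {
                /* 1a) create a pipe and move its write end to stdout */
                if (pipe(fds) == -1) {
                    perror("pipe");
                    break;
                }
                fcntl(fds[0], F_SETFD, FD_CLOEXEC); /* not for the child */
                dup2(fds[1], STDOUT_FILENO);
                close(fds[1]);
            } else {
                /* last command: its output is not piped anywhere */
                dup2(saved_out, STDOUT_FILENO);
            }

            last = start(cmds[i]);

            if (i < n - 1) {
                /* then move the pipe's read end to stdin for the next one */
                dup2(fds[0], STDIN_FILENO);
                close(fds[0]);
            }
            if (last == -1) {
                perror("fork");
                break;
            }
        }
    }

    /* Put the shell's own stdin/stdout back; this also drops the last
     * pipe ends the shell was still holding. */
    if (saved_in != -1) {
        dup2(saved_in, STDIN_FILENO);
        close(saved_in);
    }
    if (saved_out != -1) {
        dup2(saved_out, STDOUT_FILENO);
        close(saved_out);
    }
    if (i < n)
        last = -1;                      /* pipeline was not fully started */

    int status, result = -1;
    pid_t pid;
    while ((pid = wait(&status)) > 0)   /* reaps every child of the caller */
        if (pid == last && WIFEXITED(status))
            result = WEXITSTATUS(status);
    return result;
}
```

The loop body only ever looks at the current command and whether its output is piped onward, which is the point of the step-by-step approach: nothing in it depends on how many commands follow.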

Implementing pipelining in C. What would be the best way to do that?

This question is a bit old, but here's an answer that was never provided. Use libpipeline. libpipeline is a pipeline manipulation library. The use case came from one of the man page maintainers, who frequently had to run a command like the following (and work around associated OS bugs):

zsoelim < input-file | tbl | nroff -mandoc -Tutf8

Here's the libpipeline way:

pipeline *p;
int status;

p = pipeline_new ();
pipeline_want_infile (p, "input-file");
pipeline_command_args (p, "zsoelim", NULL);
pipeline_command_args (p, "tbl", NULL);
pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
status = pipeline_run (p);

The libpipeline homepage has more examples. The library is also included in many distros, including Arch, Debian, Fedora, Linux from Scratch and Ubuntu.
