Breaking down shell scripts; What happens under the hood?

So, I was given this one-line script:

echo test | cat | grep test

Could you please explain to me how exactly that would work given the following system calls: pipe(), fork(), exec() and dup2()?

I am looking for a general overview here, mainly the sequence of operations. What I know so far is that the shell will fork using fork(), and that the command's code will then replace the shell's in the child by using exec(). But what about pipe() and dup2()? How do they fall into place?

Thanks in advance.

First consider a simpler example, such as:

echo test | cat

What we want is to execute echo in a separate process, arranging for its standard output to be diverted into the standard input of the process executing cat. Ideally this diversion, once set up, would require no further intervention by the shell: the shell would just calmly wait for both processes to exit.

The mechanism that achieves this is called a "pipe". It is an interprocess communication device implemented in the kernel and exported to user space. Once created by a Unix program, a pipe appears as a pair of file descriptors with the peculiar property that whatever you write into one of them can be read from the other. This is not very useful within a single process, but keep in mind that file descriptors, including but not limited to pipes, are inherited across fork() and even across exec(). This makes the pipe an IPC mechanism that is easy to set up and reasonably efficient.

The shell creates the pipe and now owns the pair of file descriptors belonging to it, one for reading and one for writing. These file descriptors are inherited by both forked subprocesses. If only echo wrote to the pipe's write-end descriptor instead of its actual standard output, and cat read from the pipe's read-end descriptor instead of its standard input, everything would work. But they don't, and this is where dup2 comes into play.

dup2(oldfd, newfd) duplicates the file descriptor oldfd as newfd, automatically closing newfd beforehand if it is open. For example, dup2(15, 1) will close file descriptor 1 (by convention used for the standard output) and reopen it as a copy of file descriptor 15, meaning that writing to the standard output will in fact be equivalent to writing to file descriptor 15. The same applies to reading: dup2(8, 0) will make reading from file descriptor 0 (the standard input) equivalent to reading from file descriptor 8. If we then close the original file descriptor, the open file (or pipe) will have been effectively moved from the original descriptor to the new one, much like sci-fi teleports that work by first duplicating a piece of matter at a remote location and then disintegrating the original.

If you're still following the theory, the order of operations performed by the shell should now be clear:

  1. The shell creates a pipe and then forks two processes, both of which inherit the pipe's file descriptors, r and w.

  2. In the subprocess about to execute echo, the shell calls dup2(w, 1); close(w) before exec in order to redirect the standard output to the write end of the pipe. This subprocess does not read from the pipe, so it should also close r.

  3. In the subprocess about to execute cat, the shell calls dup2(r, 0); close(r) before exec in order to redirect the standard input to the read end of the pipe. This subprocess must also close its inherited copy of w; otherwise cat itself would hold the write end open and never see EOF.

  4. After forking, the main shell process must itself close both ends of the pipe. One reason is to free the resources associated with the pipe once the subprocesses exit. The other is to allow cat to actually terminate: a pipe's reader receives EOF only after all copies of the write end of the pipe are closed. In the steps above, we did close the child's redundant copy of the write end (say, file descriptor 15) right after duplicating it to 1. But file descriptor 15 also exists in the parent, because the children inherited it under that number, and the parent's copy can only be closed by the parent. Failing to do so means cat's standard input never reports EOF, and the cat process hangs as a consequence.

This mechanism is easily generalized to three or more processes connected by pipes. In the case of three processes, the pipes need to be arranged so that echo's output goes to cat's input and cat's output goes to grep's input. This requires two calls to pipe(), three calls to fork(), four calls to dup2() each followed by close() (one each for echo and grep, two for cat), three calls to exec(), and four additional calls to close() in the parent (two for each pipe). Each child should also close any inherited pipe ends it does not use.
