Confusion with dup2(), exec() and pipes
I have been struggling to understand how dup2(), exec() and pipes work together.
What I am trying to achieve: pipe the output of a program X to the input of a program Y. Something basic like who | sort, with a parent and two children, where the children are responsible for executing the programs and the parent passes the program names to the children.
Here is what I don't understand about pipes:
P1) Pipes are treated like files and should be unidirectional, but what prevents me from using one pipe for multiple unidirectional communication channels? Say I have pipe1 and three processes (P, the parent, and C1 and C2, the children) that all have the pipe open as a result of forking. All of these processes get to use the file descriptors. Assume we do everything correctly and close the unused pipe ends, and P now writes something to C1. What is the issue with using the same pipe again for communication between C1 and C2? While writing this question, an idea hit me: is the problem who gets to read from the pipe when several processes have it open simultaneously (two processes blocking on a read), i.e. the system cannot say for sure which of them should receive the buffered data written into it? If so, how is this implemented in the system?
I really want to understand the concept, so please bear with me.
To apply this question to a real case, here is some pseudocode I am dealing with:
P:
P closes the unneeded read end of pipe1
P sends the program argument ('who') to C1 via pipe1
P closes the write end
P waits for the children to exit
C1:
C1 reads the argument from the read end of pipe1
C1 dup2()s standard output to the write end of pipe1
C1 closes both ends of pipe1 (because we duped it already)
C1 execvp()s the program ('who')
C2:
C2 dup2()s the read end of pipe1 to stdin so that it gets input for the program that will be executed
C2 closes both ends of pipe1
C2 waits for input on stdin from C1 via the dup'ed pipe1
C2 execvp()s the program ('sort') with this input
However, if I introduce another pipe, pipe2, it looks something like this:
P:
P closes both ends of the unneeded pipe2
P closes the unneeded read end of pipe1
P sends the program argument ('who') to C1 via pipe1
P closes the write end
P waits for the children to exit
C1:
C1 closes the read end of pipe2
C1 reads the argument from the read end of pipe1
C1 dup2()s standard output to the write end of pipe2
C1 closes the write end of pipe2
C1 closes both ends of pipe1 (with pipe2, pipe1 is redundant in this child)
C1 execvp()s the program ('who')
C2:
C2 dup2()s the read end of pipe2 to stdin
C2 closes both ends of pipe1
C2 waits for input on stdin from C1 via the dup'ed pipe2
C2 executes the program sort with this input
Is the assumption correct that a pipe should not be reused across multiple processes because the system may not be sure whom to "serve"? Or is there another reason for this?
Pipes are primarily designed for one-to-one communication: one writer, one reader. While nothing prevents you from having as many readers and writers as you want, the behavior often makes this not very usable, especially with multiple readers:
Each byte written to a pipe is delivered to only one reader. If you want to broadcast some information, you need to use a different IPC mechanism or explicitly replicate the data (as the tee command does).
With multiple writers, data from different writes may be interleaved. The only guarantee is that writes of size PIPE_BUF or less are atomic.
In the architecture you're describing, you have two independent communication channels: P sends who to C1, and C1 sends the output of running the who command to C2. In a shell script, that would be something similar to
echo who | { read command; exec "$command"; } | sort
with echo who executed in the original process rather than in a subshell.
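For comparison, here is a minimal C sketch of that two-channel layout, assuming a POSIX system. The function name run_pipeline and its return convention are hypothetical and not taken from the question; execlp() is used as a convenient sibling of execvp() because the programs take no arguments. Error handling is kept short on purpose.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from the original post: runs the equivalent of
 * `producer | consumer`. pipe1 carries the program name from P to C1,
 * pipe2 carries C1's output to C2. Returns 0 if both children exit 0. */
int run_pipeline(const char *producer, const char *consumer)
{
    int pipe1[2], pipe2[2];
    if (pipe(pipe1) == -1 || pipe(pipe2) == -1) {
        perror("pipe");
        return -1;
    }

    pid_t c1 = fork();
    if (c1 == -1) {
        perror("fork");
        return -1;
    }
    if (c1 == 0) {                       /* C1 */
        char name[256];
        size_t len = 0;
        ssize_t n;
        close(pipe1[1]);                 /* so read() below can see EOF */
        close(pipe2[0]);
        while (len < sizeof name - 1 &&
               (n = read(pipe1[0], name + len, sizeof name - 1 - len)) > 0)
            len += (size_t)n;
        name[len] = '\0';
        close(pipe1[0]);
        if (len == 0)
            _exit(126);
        dup2(pipe2[1], STDOUT_FILENO);   /* stdout now feeds pipe2 */
        close(pipe2[1]);
        execlp(name, name, (char *)NULL);
        _exit(127);                      /* only reached if exec failed */
    }

    pid_t c2 = fork();
    if (c2 == -1) {
        perror("fork");
        close(pipe1[0]); close(pipe1[1]);
        close(pipe2[0]); close(pipe2[1]);
        waitpid(c1, NULL, 0);
        return -1;
    }
    if (c2 == 0) {                       /* C2 */
        close(pipe1[0]);
        close(pipe1[1]);
        close(pipe2[1]);                 /* so the consumer can see EOF */
        dup2(pipe2[0], STDIN_FILENO);    /* stdin now drains pipe2 */
        close(pipe2[0]);
        execlp(consumer, consumer, (char *)NULL);
        _exit(127);
    }

    /* P: only the write end of pipe1 is needed here. */
    close(pipe1[0]);
    close(pipe2[0]);
    close(pipe2[1]);
    if (write(pipe1[1], producer, strlen(producer)) == -1)
        perror("write");
    close(pipe1[1]);                     /* C1 sees EOF after the name */

    int s1 = 0, s2 = 0;
    waitpid(c1, &s1, 0);
    waitpid(c2, &s2, 0);
    return (WIFEXITED(s1) && WEXITSTATUS(s1) == 0 &&
            WIFEXITED(s2) && WEXITSTATUS(s2) == 0) ? 0 : 1;
}
```

Calling run_pipeline("who", "sort") from main() should behave like who | sort. Note that C2 does not need a separate "wait for input" step: the exec'ed program simply blocks on its stdin until C1 writes or closes pipe2.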
Your first proposal doesn't work because there's no way to say that the output of P will go to C1 and the output of C1 will go to C2. It's still the same pipe, so the output of P could go to C2 and the output of C1 could come back to C1 itself, or it could be a mixture.
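The effect can be observed with a small experiment. The sketch below, again assuming POSIX, lets two children read from the same pipe while the parent writes a known number of bytes. The function shared_pipe_demo and the extra result pipes used to report the counts are hypothetical scaffolding for the demonstration. How the bytes are split between the two readers depends on scheduling and is not predictable; only the total is.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: two children read from the SAME pipe while the
 * parent writes `total` bytes. Each child reports how many bytes it
 * received over its own result pipe. Returns the sum, or -1 on error. */
long shared_pipe_demo(long total, long counts[2])
{
    int data[2];
    int result[2][2];
    pid_t kids[2];

    if (pipe(data) == -1)
        return -1;

    for (int i = 0; i < 2; i++) {
        if (pipe(result[i]) == -1)
            return -1;
        kids[i] = fork();
        if (kids[i] == -1)
            return -1;
        if (kids[i] == 0) {              /* reader child */
            char buf[64];
            long got = 0;
            ssize_t n;
            close(data[1]);              /* otherwise EOF never arrives */
            close(result[i][0]);
            while ((n = read(data[0], buf, sizeof buf)) > 0)
                got += n;
            if (write(result[i][1], &got, sizeof got) != (ssize_t)sizeof got)
                _exit(1);
            _exit(0);
        }
        close(result[i][1]);             /* parent only reads results */
    }

    /* Parent: the single writer on the shared pipe. */
    close(data[0]);
    char chunk[256];
    memset(chunk, 'x', sizeof chunk);
    long left = total;
    while (left > 0) {
        size_t want = left < (long)sizeof chunk ? (size_t)left : sizeof chunk;
        ssize_t n = write(data[1], chunk, want);
        if (n == -1)
            break;
        left -= n;
    }
    close(data[1]);                      /* both readers now reach EOF */

    long sum = 0;
    int ok = 1;
    for (int i = 0; i < 2; i++) {
        counts[i] = 0;
        if (read(result[i][0], &counts[i], sizeof counts[i])
                != (ssize_t)sizeof counts[i])
            ok = 0;
        close(result[i][0]);
        waitpid(kids[i], NULL, 0);
        sum += counts[i];
    }
    return ok ? sum : -1;
}
```

Printing counts[0] and counts[1] after a call such as shared_pipe_demo(100000, counts) typically shows a different split from run to run, while the two values always add up to the number of bytes written. Nothing is duplicated and nothing is routed to a particular reader, which is why the single-pipe design cannot guarantee that P's message reaches C1.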