What Happens When I Call fork() in Unix?
I've tried to look this up, but I'm struggling a bit to understand the relation between the parent process and the child process immediately after I call fork().
Are they completely separate processes, only associated by the id/parent id? Or do they share memory? For example, the 'code' section of each process: is that duplicated so that each process has its own identical copy, or is it 'shared' in some way so that only one exists?
I hope that makes sense.
In the name of full disclosure, this is 'homework related'; while not a direct question from the book, I have a feeling it's mostly academic and, in practice, I probably don't need to know.
As it appears to the process, the entire memory is duplicated.
In reality, the kernel uses a "copy on write" system. The first time either process changes its memory after fork(), a separate copy is made of the modified page (usually 4kB).
Usually the code segment of a process is not modified, in which case it remains shared.
Logically, a fork creates an identical copy of the original process that is largely independent of the original. For performance reasons, memory is shared with copy-on-write semantics, which means that unmodified memory (such as code) remains shared.
File descriptors are duplicated, so that the forked process could, in principle, take over a database connection on behalf of the parent (or they could even jointly communicate with the database if the programmer is a bit twisted). More commonly, this is used to set up pipes between processes so you can write find -name '*.c' | xargs grep fork.
A bunch of other stuff is shared. See here for details.
One important omission is threads: the child process only inherits the thread that called fork(). This causes no end of trouble in multithreaded programs, since the status of mutexes, etc., that were locked in the parent is implementation-specific (and don't forget that malloc() and printf() use locks internally). The only safe thing to do in the child after fork() returns is to call execve() as soon as possible, and even then you have to be cautious with file descriptors. See here for the full horror story.
EDIT: typos. HTH
Yes, they are separate processes, but with some special "properties". One of them is the child-parent relation.
But more important is the sharing of memory pages in a copy-on-write (COW) manner: until one of them performs a write (to a global variable or whatever) on a page, the memory pages are shared. When a write is performed, a copy of that page is created by the kernel and mapped at the right address.
The COW magic is done in the kernel by marking the pages as read-only and using the page-fault mechanism.