
How does COW after fork work?

I was reading about the copy-on-write (COW) approach used after fork in modern UNIX-like systems.

Suppose we have a process, P1. It forks, and we get another process, P2. Their virtual memory is backed by the same physical pages because of COW. One static global variable (for example, static long variable; declared outside of main) is located in the .data segment, on a virtual page backed by physical page A.

Now P1 changes its static global variable; the kernel, after processing the protection fault, maps a new page (page B) to the virtual memory of P1 to store that changed variable.

In the same way, P2 changes its static global variable; the kernel, after processing the protection fault, maps a new page (page C) into the virtual memory of P2 to store that changed variable.

Now nothing is referencing page A. What happens to it? I suppose it is not left "hanging in the air", keeping one physical page out of use and thus wasting memory?

When Page B is created, the COW flag on Page A is removed because the page is no longer shared; there is no longer a need to copy it before modifying it. Therefore, P2 simply uses Page A, possibly without incurring a page fault at all, and certainly without a need to copy the page. Consequently, there is no Page C and Page A is not left unreferenced.

Note that if P1 forks again, or if P2 forks, or both, before modifying the variable on Page A, then there might be 3 or more processes referencing the page. The system usually maintains a reference counter for each page in the memory mapping control information, recording how many processes have the page mapped into their process memory, and that count controls whether the COW flag can be cleared. Until there is just one process referencing the page, the COW flag stays in effect.
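The reference-count rule described above can be sketched as a toy model. This is only an illustration, not any real kernel's code: the `Frame`, `Process`, and `Kernel` classes and all of their methods are hypothetical names invented for the example, and each process is assumed to have exactly one virtual page.

```python
class Frame:
    """One physical page frame in the toy model."""
    def __init__(self, name, data):
        self.name = name
        self.data = data
        self.refcount = 0          # how many processes map this frame


class Process:
    """A process with a single virtual page (enough for this example)."""
    def __init__(self, frame, writable):
        self.frame = frame
        self.writable = writable
        frame.refcount += 1


class Kernel:
    """Hypothetical kernel: hands out frames named A, B, C, ..."""
    def __init__(self):
        self.allocated = []        # names of all frames handed out so far

    def alloc(self, data):
        frame = Frame(chr(ord("A") + len(self.allocated)), data)
        self.allocated.append(frame.name)
        return frame

    def spawn(self, data):
        return Process(self.alloc(data), writable=True)

    def fork(self, parent):
        parent.writable = False    # both mappings become read-only (COW)
        return Process(parent.frame, writable=False)

    def exit(self, proc):
        proc.frame.refcount -= 1   # frame is reusable once this reaches zero

    def write(self, proc, value):
        if not proc.writable:                  # simulated protection fault
            if proc.frame.refcount > 1:        # still shared: copy first
                proc.frame.refcount -= 1
                proc.frame = self.alloc(proc.frame.data)
                proc.frame.refcount = 1
            proc.writable = True               # sole owner: re-enable writes
        proc.frame.data = value


def two_process_demo():
    k = Kernel()
    p1 = k.spawn(0)      # the variable lives in frame A
    p2 = k.fork(p1)      # A is now mapped by P1 and P2, read-only
    k.write(p1, 1)       # A is shared, so P1 is moved to a copy, B
    k.write(p2, 2)       # A has a single user left, so P2 keeps it
    return p1.frame.name, p2.frame.name, k.allocated


print(two_process_demo())  # ('B', 'A', ['A', 'B'])
```

In this model no frame C is ever allocated for the two-process case. With a second fork before any write, the first two writers each get a copy and the last one keeps A.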

An exec operation will decrease the reference count for all the pages in the old process, and that will free the pages for reuse if the reference count goes to zero. If P1 set up some explicitly shared memory, the shared memory pages will not have the COW flag set even though the reference counter can be bigger than 1.
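The difference between explicitly shared memory and ordinary copy-on-write memory can be observed from user space. The sketch below assumes a UNIX-like system where `os.fork` is available. The function name `fork_and_write` is made up for the example, and anonymous `mmap` regions stand in for the shared and private pages.

```python
import mmap
import os
import struct


def fork_and_write():
    # Anonymous mappings: one explicitly shared, one private (COW after fork).
    shared = mmap.mmap(-1, 8, flags=mmap.MAP_SHARED)
    private = mmap.mmap(-1, 8, flags=mmap.MAP_PRIVATE)
    struct.pack_into("q", shared, 0, 1)
    struct.pack_into("q", private, 0, 1)

    pid = os.fork()
    if pid == 0:
        # Child: write to both regions, then exit without running cleanup.
        struct.pack_into("q", shared, 0, 42)
        struct.pack_into("q", private, 0, 42)
        os._exit(0)

    os.waitpid(pid, 0)  # parent waits until the child has finished writing
    return (
        struct.unpack_from("q", shared, 0)[0],
        struct.unpack_from("q", private, 0)[0],
    )


if __name__ == "__main__":
    print(fork_and_write())  # (42, 1)
```

The parent sees the child's write to the shared region but not its write to the private one. That matches the description above: the shared pages are not copied on write, while the private page is.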

Page A doesn't become dangling, because only one copy takes place.

One of the two processes will trigger the COW first. It will get the new frame B, and the other process sticks with A.

We could arrange for the other process not to get a page fault. That's probably fraught with races, particularly under SMP whereby each core has its own TLB.

Or we can let the other process get a page fault too. It will know that frame A no longer requires copying because, say, there is a ref count in the management object which tracks A, and that ref count has value 1 indicating that A is uniquely mapped. So the page fault handler will just mark the page as present, keeping it mapped to A.

The same thing should happen if a parent spawns a child, that child exits, and then the parent touches a previously shared page. Since it is no longer shared, there is no reason to copy-on-write it.
