How does fork() mark the parent's PTEs as read-only?

I've searched through a lot of resources, but found nothing concrete on the matter:

I know that on some Linux systems, the fork() syscall uses copy-on-write: the parent and the child share the same physical pages, but the PTEs are marked read-only so that COW can happen later. When either process tries to write to a page, a page fault occurs and the page is copied to another place, where it can be modified.

However, I cannot understand how the OS reaches the shared PTEs to mark them as read-only. I have hypothesized that when a fork() syscall occurs, the OS performs a "page walk" over the parent's page table and marks the entries read-only, but I can find no confirmation of this, or any information about the process.

Does anyone know how the pages come to be marked read-only? I would appreciate any help. Thanks!

Linux implements the fork syscall by iterating over all memory ranges (mmaps, stack and heap) of the parent process. These ranges (VMAs, virtual memory areas) are copied in the function copy_page_range (mm/memory.c), which loops over the page table entries:

    /*
     * If it's a COW mapping, write protect it both
     * in the parent and the child
     */
    if (is_cow_mapping(vm_flags)) {
        ptep_set_wrprotect(src_mm, addr, src_pte);
        pte = pte_wrprotect(pte);
    }

Here is_cow_mapping is true for private and potentially writable mappings: the flags bitfield is checked for the shared and maywrite bits, and only the maywrite bit may be set.

    #define VM_SHARED   0x00000008
    #define VM_MAYWRITE 0x00000020

    static inline bool is_cow_mapping(vm_flags_t flags)
    {
        return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
    }
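To make the write-protect step concrete, here is a minimal user-space sketch of the same idea. It is a toy model, not kernel code: toy_pte_t, TOY_PTE_WRITE, toy_copy_pte_range and toy_fork_write_protects_both are hypothetical names invented for illustration, and vm_flags_t is assumed to be a plain unsigned long. Only the two flag values and the is_cow_mapping test are taken from the snippets above.

```c
#include <stdbool.h>
#include <stddef.h>

#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL

/* Assumption: stand-in for the kernel's vm_flags_t. */
typedef unsigned long vm_flags_t;

/* Hypothetical PTE type and write bit, used by this toy model only. */
typedef unsigned long toy_pte_t;
#define TOY_PTE_WRITE 0x2UL

/* Same test as in the answer: private and potentially writable. */
static inline bool is_cow_mapping(vm_flags_t flags)
{
    return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Toy version of the loop: copy each parent entry to the child and,
 * for a COW mapping, clear the write bit in BOTH parent and child.
 */
static void toy_copy_pte_range(toy_pte_t *src, toy_pte_t *dst,
                               size_t n, vm_flags_t vm_flags)
{
    for (size_t i = 0; i < n; i++) {
        toy_pte_t pte = src[i];
        if (is_cow_mapping(vm_flags)) {
            src[i] &= ~TOY_PTE_WRITE;   /* write-protect the parent */
            pte &= ~TOY_PTE_WRITE;      /* write-protect the child  */
        }
        dst[i] = pte;
    }
}

/* Returns 1 if no entry in either table is still writable after the copy. */
static int toy_fork_write_protects_both(vm_flags_t vm_flags)
{
    toy_pte_t parent[2] = { 0x1000UL | TOY_PTE_WRITE, 0x2000UL | TOY_PTE_WRITE };
    toy_pte_t child[2] = { 0, 0 };

    toy_copy_pte_range(parent, child, 2, vm_flags);
    for (size_t i = 0; i < 2; i++) {
        if ((parent[i] | child[i]) & TOY_PTE_WRITE)
            return 0;
    }
    return 1;
}
```

Calling toy_fork_write_protects_both(VM_MAYWRITE) models a private writable mapping, where both tables end up read-only. With VM_SHARED | VM_MAYWRITE the entries keep their write bit, since a shared mapping is not a COW mapping.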

PUD, PMD, and PTE are described in books such as https://www.kernel.org/doc/gorman/html/understand/understand006.html and in articles such as the 2005 LWN article "Four-level page tables merged".

How the fork implementation reaches copy_page_range:

  • The fork syscall implementation (sys_fork, or SYSCALL_DEFINE0(fork)) is do_fork (kernel/fork.c), which calls
  • copy_process, which calls many copy_* functions, including
  • copy_mm, which calls
  • dup_mm to allocate and fill the new mm struct, where most of the work is done by
  • dup_mmap (still kernel/fork.c), which checks what was mmapped and how. (I was unable to trace the exact path to the COW implementation here, so I used a web search for something like "fork+COW+dup_mm" to get hints like [1], [2], or [3].) After checking the mmap types, the line retval = copy_page_range(mm, oldmm, mpnt); does the real work.
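The visible result of this call chain can be checked from user space with a small sketch. The names counter and child_write_stays_private are made up for this example. It only shows the semantics that COW has to preserve (a write in the child never reaches the parent). It does not inspect the page tables, so it cannot prove that the page was shared read-only before the write.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives in a private, writable mapping (the data segment). */
static volatile int counter = 1;

/*
 * Returns 1 if the child's write stayed private to the child,
 * 0 if the parent saw it, and -1 on error.
 */
int child_write_stays_private(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* With COW, this write faults and the child gets its own copy. */
        counter = 2;
        _exit(counter == 2 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* The parent still sees its original value. */
    return counter == 1 ? 1 : 0;
}
```

On a system where fork behaves as described above, child_write_stays_private() should return 1.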
