
In case of a process context switch, is the virtual address space (VAS) of the new process loaded into the CPU context (CPU's registers)?

I have read:

Process switching is context switching from one process to a different process. It involves switching out all of the process abstractions and resources in favor of those belonging to a new process. Most notably and expensively, this means switching the memory address space. This includes memory addresses, mappings, page tables, and kernel resources—a relatively expensive operation.

Also:

A context is the contents of a CPU's registers and program counter at any point in time.

Context switching can be described in slightly more detail as the kernel (ie, the core of the operating system) performing the following activities with regard to processes (including threads) on the CPU: (1) suspending the progression of one process and storing the CPU's state (ie, the context) for that process somewhere in memory, (2) retrieving the context of the next process from memory and restoring it in the CPU's registers and (3) returning to the location indicated by the program counter (ie, returning to the line of code at which the process was interrupted) in order to resume the process.

As the VAS is separate for each process and can be up to 4 GB in size, is the whole VAS of a process loaded into the CPU context when the process is context-switched?

Also, as each process has a separate page table, is the page table also brought into the CPU context during a context switch?

If not, then why is a process context switch slower than a thread context switch (threads share the same VAS)?

These questions are related. You swap out one virtual address space for another by changing which set of page tables is performing the virtual-to-physical translation. That's how the address space swap is accomplished.

Let's consider a very simple example.

  • Say we have two processes, P_A and P_B. Both processes are executing their program image at virtual address 0x1000.
  • Not visible to the processes are a set of page tables, which map the virtual address space to physical pages of RAM:
    • Pagetable T_A maps virtual address 0x1000 to physical address 0x88000.
    • Pagetable T_B maps virtual address 0x1000 to physical address 0x99000.
  • Let's say the theoretical CPU has a register called PP (pagetable pointer).

After the processes have been initialized, "swapping the virtual address space" between the two is simple. To load the address space for P_A, you simply put the address of T_A in PP, and that process now "sees" the memory at 0x88000. Likewise, putting the address of T_B in PP for P_B makes it "see" the memory at 0x99000.
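To make the idea concrete, here is a toy model in C. The names (page_table, pp, translate) and the one-entry "table" are made up purely for illustration; this is not how real hardware is programmed, it just shows that "switching address spaces" is nothing more than repointing PP.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical one-entry "page table": maps one virtual page to a physical frame. */
    struct page_table {
        uint64_t virt;
        uint64_t phys;
    };

    static struct page_table t_a = { 0x1000, 0x88000 };  /* T_A for process P_A */
    static struct page_table t_b = { 0x1000, 0x99000 };  /* T_B for process P_B */

    /* The theoretical "PP" (pagetable pointer) register. */
    static struct page_table *pp;

    /* What the MMU conceptually does: translate virt through whatever PP points at. */
    static uint64_t translate(uint64_t virt) {
        return (virt == pp->virt) ? pp->phys : 0;  /* 0 = page fault, for brevity */
    }

    int main(void) {
        pp = &t_a;   /* "switch to" P_A */
        printf("P_A sees 0x1000 at 0x%llx\n", (unsigned long long)translate(0x1000));

        pp = &t_b;   /* context switch: just repoint PP */
        printf("P_B sees 0x1000 at 0x%llx\n", (unsigned long long)translate(0x1000));
        return 0;
    }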

When switching between threads (of the same process), the virtual address space does not need to be changed (because all threads of a given process share the same virtual address space).

Of course there are other things which need to be swapped in as well (like the CPU registers), but for this discussion, we're only concerned with virtual memory.

On x86 CPUs, the CR3 register is the pointer to the base of the page table hierarchy. It is this register which the OS changes to change address spaces when swapping processes.
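On a real x86-64 kernel, that switch boils down to a single privileged register write. A minimal sketch, assuming GCC-style inline assembly and a made-up helper name (real kernels also handle PCIDs, global pages, lazy TLB modes, and so on); this must run in ring 0, so treat it as illustration only:

    #include <stdint.h>

    /* Load a new top-level page table by writing CR3.
     * next_pgd_phys is the *physical* address of the next process's
     * top-level table. On CPUs without PCID, writing CR3 also flushes
     * all non-global TLB entries. */
    static inline void switch_address_space(uint64_t next_pgd_phys)
    {
        __asm__ volatile("mov %0, %%cr3" : : "r"(next_pgd_phys) : "memory");
    }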


Of course, it's more complicated than that. Because the possible virtual address space is so large (4 GiB on x86-32, and 256 TiB of the theoretical 16 EiB on x86-64 with 48-bit addressing), a flat pagetable would take up a ridiculous amount of space (one entry for every 4 KiB page). To alleviate this, additional levels of indirection are added to the pagetables, which is why I referred to them as a hierarchy. On x86-64, there are 4 levels.
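To see where those 4 levels come from, here is how a 48-bit x86-64 virtual address splits into four 9-bit table indices plus a 12-bit page offset (4 KiB pages). The split_virtual helper is just illustrative; the field names follow the usual Intel terminology:

    #include <stdint.h>
    #include <stdio.h>

    /* Decompose a 48-bit virtual address into its 4-level paging indices. */
    static void split_virtual(uint64_t va)
    {
        unsigned pml4 = (va >> 39) & 0x1FF;  /* level 4: page map level 4       */
        unsigned pdpt = (va >> 30) & 0x1FF;  /* level 3: page directory pointer */
        unsigned pd   = (va >> 21) & 0x1FF;  /* level 2: page directory         */
        unsigned pt   = (va >> 12) & 0x1FF;  /* level 1: page table             */
        unsigned off  =  va        & 0xFFF;  /* byte offset within the 4 KiB page */
        printf("%#llx -> PML4=%u PDPT=%u PD=%u PT=%u offset=%#x\n",
               (unsigned long long)va, pml4, pdpt, pd, pt, off);
    }

    int main(void) {
        split_virtual(0x00007f1234567abcULL);
        return 0;
    }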

Now imagine if the CPU had to "walk" these paging structures for every virtual-to-physical translation. A single read from virtual memory would require a total of 5 memory accesses: four table reads plus the actual data access. This would be terribly slow.

Enter the Translation Lookaside Buffer, or TLB. The TLB caches these translations, so a given virtual-to-physical translation only requires the pagetables to be walked once. After that, the TLB remembers the translation, which is much faster. (Of course the TLB can get full, but cache eviction is another story.)

So say P_A is running, and all of a sudden the kernel swaps in the address space for P_B. Now all of those cached virtual-to-physical translations are no longer valid for the new virtual address space! That means we need to flush the TLB, or clear all of its entries out. And because of that, the CPU has to do the slow pagetable walking again, until the TLB cache "heats up" again.

This is why it's considered "expensive" to swap virtual address spaces. Not because it's hard to write to CR3, but because we trash the TLB every time we do.
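A toy software TLB makes that cost visible. In the sketch below everything is invented for illustration (a direct-mapped table, a fake page-table walk): hits are cheap, misses force a walk, and flushing on an address-space switch makes all of the walks happen again.

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    #define TLB_SIZE 64

    struct tlb_entry { uint64_t vpn; uint64_t pfn; int valid; };
    static struct tlb_entry tlb[TLB_SIZE];
    static unsigned long walks;   /* how many slow page-table walks we had to do */

    /* Fake page-table walk: just counts and returns a made-up frame number. */
    static uint64_t walk_page_tables(uint64_t vpn) { walks++; return vpn + 0x1000; }

    static uint64_t lookup(uint64_t vaddr)
    {
        uint64_t vpn = vaddr >> 12;
        struct tlb_entry *e = &tlb[vpn % TLB_SIZE];
        if (!e->valid || e->vpn != vpn) {        /* TLB miss: walk and refill */
            e->vpn = vpn; e->pfn = walk_page_tables(vpn); e->valid = 1;
        }
        return (e->pfn << 12) | (vaddr & 0xFFF);
    }

    /* What a CR3 write implies on a CPU without PCID: every entry is gone. */
    static void flush_tlb(void) { memset(tlb, 0, sizeof tlb); }

    int main(void)
    {
        /* Touch 16 pages repeatedly: only the first touch of each page walks. */
        for (int i = 0; i < 1000; i++) lookup(0x400000 + (i % 16) * 0x1000);
        printf("walks before switch: %lu\n", walks);   /* 16: the TLB is "warm" */

        flush_tlb();                                   /* process context switch */
        for (int i = 0; i < 16; i++) lookup(0x400000 + i * 0x1000);
        printf("walks after switch:  %lu\n", walks);   /* 16 more: cold again */
        return 0;
    }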

