How is a virtual address translated to its physical address on the backing store?

Original 2013-07-24 10:02:56 8 4 c/ windows/ operating-system

We have an address translation table to translate a virtual address (VA) of a process to its corresponding physical address in RAM. If the table has no entry for a VA, a page fault results: the kernel goes to the backing store (often a hard drive), fetches the corresponding data, and updates RAM and the address translation table. So my question is: how does the OS know which address in the backing store corresponds to a VA? Does it have a separate translation table for that?

4 answers

A process starts by allocating virtual memory. That eventually causes a page fault when the program actually starts addressing the virtual memory. The OS knows that the memory access is valid, since the memory was allocated explicitly.

So no harm done, the OS simply maps the VM address to a physical address.

If the page fault is for an address that was not previously requested to be a valid VM address, then the processor will discover that there is no page table entry for the address and will instead raise a GP fault, which shows up as an AccessViolation or segfault in your program. Kaboom, program over.
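The two outcomes described above can be sketched as a toy model. This is only an illustration, not how any real kernel is written: `ToyAddressSpace`, `SegmentationFault`, and the 4096-byte page size are all made up for the example.

```python
PAGE_SIZE = 4096  # assumed page size


class SegmentationFault(Exception):
    """Stands in for the GP fault / AccessViolation / segfault."""


class ToyAddressSpace:
    def __init__(self):
        self.allocated = set()    # virtual page numbers the process asked for
        self.page_table = {}      # virtual page number -> physical frame number
        self.next_free_frame = 0  # hypothetical, never-exhausted frame supply

    def allocate(self, start, length):
        """Record a range of virtual addresses as valid; map nothing yet."""
        first = start // PAGE_SIZE
        last = (start + length - 1) // PAGE_SIZE
        self.allocated.update(range(first, last + 1))

    def access(self, vaddr):
        """Return the physical address for vaddr, handling a fault if needed."""
        page, offset = divmod(vaddr, PAGE_SIZE)
        if page not in self.page_table:        # page fault
            if page not in self.allocated:     # never requested: invalid access
                raise SegmentationFault(hex(vaddr))
            self.page_table[page] = self.next_free_frame  # valid: map it now
            self.next_free_frame += 1
        return self.page_table[page] * PAGE_SIZE + offset
```

Calling `allocate(0x10000, 8192)` and then `access(0x10004)` maps the page on first touch, while `access(0x20000)` raises `SegmentationFault` because that address was never allocated.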

There is no direct correlation, at least not in the way that you suppose.

The operating system divides virtual and physical RAM, as well as swap space (backing store) and mapped files, into pages, most commonly of 4096 bytes.

When your code accesses a certain address, this is always a virtual address within a page that is either valid-in-core, valid-not-accessed, valid-out-of-core, or invalid. The OS may have other properties (such as "has been written to") in its books, but they're irrelevant for us here.

If the page is in-core, then it has a physical address, otherwise it does not. When swapped out and in again, the very same page could in theory land in a different physical region of memory. Similarly, the page after some other page in memory (virtual or physical) could be before that page in the swap file or in a memory-mapped file. There's no guarantee of any particular order.

Thus, there is no such thing as translating a virtual address to a physical address in backing store. There is only a translation from a virtual address to a page, which may temporarily have a physical address. In the easiest case, "translating" means dividing by 4096, but of course other schemes are possible.
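For what it's worth, the "dividing by 4096" step looks like this in a small sketch. The function name `split_address` is just for illustration; the shift and mask are equivalent to dividing by 4096 and taking the remainder.

```python
PAGE_SIZE = 4096
PAGE_SHIFT = 12  # 2**12 == 4096


def split_address(vaddr):
    """Return (virtual page number, offset within the page)."""
    return vaddr >> PAGE_SHIFT, vaddr & (PAGE_SIZE - 1)


page, offset = split_address(0x7FFE1234)
print(hex(page), hex(offset))  # prints 0x7ffe1 0x234
```

The page number is what the OS looks up in its books; the offset is carried over unchanged wherever the page currently lives.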

Further, every time your code accesses a memory location, the virtual address must be translated to a physical one. There is dedicated logic inside the CPU to do this translation fully automatically (for a very small subset of "hot" pages, often as few as 64), or in a hardware-assisted way, which usually involves a lookup in a more or less complicated hierarchical structure.
This is also a fault, but it's one that you don't see. The only faults that you get to see are the ones where the OS doesn't have a valid page (or can't supply it for some reason), and thus can't assign a physical address to the to-be-translated virtual one.
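A rough sketch of that fast path and slow path, under loud assumptions: a 64-entry cache with least-recently-used eviction (real hardware may evict differently), and a two-level structure that splits a 32-bit address into 10 + 10 + 12 bits. `ToyMMU` and its fields are invented names.

```python
from collections import OrderedDict

PAGE_SHIFT = 12
TLB_ENTRIES = 64  # "often as few as 64"


class ToyMMU:
    def __init__(self, page_directory):
        # page_directory: dict of dir_index -> dict of table_index -> frame number
        self.page_directory = page_directory
        self.tlb = OrderedDict()  # virtual page number -> frame number
        self.tlb_misses = 0

    def walk(self, vpn):
        """Slow path: look the page up in the two-level structure."""
        dir_index, table_index = vpn >> 10, vpn & 0x3FF
        table = self.page_directory.get(dir_index)
        if table is None or table_index not in table:
            raise LookupError("page fault: no mapping for page %#x" % vpn)
        return table[table_index]

    def translate(self, vaddr):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & 0xFFF
        if vpn in self.tlb:
            self.tlb.move_to_end(vpn)         # hit: mark as most recently used
        else:
            self.tlb_misses += 1              # miss: the fault you don't see
            self.tlb[vpn] = self.walk(vpn)
            if len(self.tlb) > TLB_ENTRIES:
                self.tlb.popitem(last=False)  # evict the least recently used
        return (self.tlb[vpn] << PAGE_SHIFT) | offset
```

Only the `LookupError` case corresponds to a fault the OS has to deal with; a plain miss is resolved by the walk and the program never notices.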

When your program asks for memory, the OS remembers that certain pages are valid, but they do not exist yet because you have never accessed them.

The first time you access a page, a fault happens and obviously its address is nowhere in the translation tables (how could it be, it doesn't exist!). Thus the OS looks into its books and (assuming the page is valid) either loads the page from disk or, if there is nothing to load, assigns the address of a zero page.

Quite often, the OS will cheat: all zero pages are the same write-protected zero page until you actually write to one, at which point a fault occurs and you are secretly redirected to a different physical memory area, one which you can write to.
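The zero-page trick can be imitated in a few lines. This is only a sketch: the immutability of Python `bytes` stands in for write protection, and `ToyZeroFillMemory` is a made-up name.

```python
PAGE_SIZE = 4096
ZERO_PAGE = bytes(PAGE_SIZE)  # one shared, immutable page of zeros


class ToyZeroFillMemory:
    def __init__(self):
        # virtual page number -> shared bytes or private bytearray
        self.pages = {}

    def read(self, page, offset):
        # First touch: point the page at the shared zero page.
        return self.pages.setdefault(page, ZERO_PAGE)[offset]

    def write(self, page, offset, value):
        current = self.pages.setdefault(page, ZERO_PAGE)
        if current is ZERO_PAGE:
            # "Write fault": give this page its own writable copy.
            current = self.pages[page] = bytearray(ZERO_PAGE)
        current[offset] = value
```

Reading pages 1 and 2 leaves both pointing at the same `ZERO_PAGE` object; the first `write` to page 1 swaps in a private copy without disturbing page 2.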

Otherwise, that is, if you haven't reserved the memory, the OS sends a signal (or an equivalent; Windows calls it an "exception"), which will terminate your process unless handled.

For a multitude of reasons, the OS may later decide to remove one or several pages from your working set. This normally does not immediately remove them, but makes them candidates for being swapped (for non-mapped data) or discarded (for mapped data) in case more memory is needed. When you access an address in one of these pages again, the page is either re-added to your working set (likely pushing another one out) or reloaded from disk.

In either case, all the OS needs to know is how to translate your virtual address to a page identifier of some sort (e.g. a "page frame number"), and whether the page is resident (and at what address).
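To tie this back to the question, here is a sketch of what "the books" might contain. It assumes the OS keeps one record per valid virtual page, and that the record itself says where the contents live on disk, so no separate VA-to-disk translation table is needed. All names (`PageRecord`, `resolve_fault`, the `("pagefile", 42)` slot) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    IN_CORE = "valid, in core"
    NOT_ACCESSED = "valid, never accessed"
    OUT_OF_CORE = "valid, out of core"


class InvalidPage(Exception):
    """Stands in for the signal/exception sent to the process."""


@dataclass
class PageRecord:
    state: State
    frame: Optional[int] = None      # physical frame, meaningful when IN_CORE
    backing: Optional[tuple] = None  # (file, page slot) on disk, if any


def resolve_fault(records, page, free_frames, read_backing):
    """Make `page` resident and return its frame; raise if the page is invalid."""
    record = records.get(page)
    if record is None:
        raise InvalidPage(hex(page))
    if record.state is State.IN_CORE:
        return record.frame
    frame = free_frames.pop()
    if record.state is State.OUT_OF_CORE:
        read_backing(record.backing, frame)  # copy the page back from disk
    # NOT_ACCESSED pages need no disk read; they start out zero-filled.
    record.state, record.frame = State.IN_CORE, frame
    return frame
```

Note that the frame handed out on page-in is simply whichever one is free, which is why the same page can land at a different physical address each time.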

I think the answer to your question has to do with the interrupt table. A page fault is a kind of interrupt (more precisely, an exception raised by the CPU), and the operating system must have a handler for it. The handler code is already in the OS kernel, and the address of that code is stored in the interrupt table. So when a page fault happens, the OS jumps to that code, which brings the unmapped page into physical memory.
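As a toy illustration of that dispatch (not real kernel code): the table is just a mapping from vector number to handler. Vector 14 is the page-fault vector on x86; the handler body and the names `interrupt_table` and `raise_interrupt` are invented for the sketch.

```python
PAGE_FAULT_VECTOR = 14  # the page-fault vector number on x86


def handle_page_fault(faulting_address):
    # A real handler would locate the page and map it; here we only report.
    return "page in %#x" % faulting_address


# Toy "interrupt table": vector number -> handler installed by the kernel.
interrupt_table = {PAGE_FAULT_VECTOR: handle_page_fault}


def raise_interrupt(vector, *args):
    """Conceptually what the CPU does: index the table and jump to the handler."""
    return interrupt_table[vector](*args)
```

The table only tells the CPU where the handler is; how the handler finds the page on disk is still up to the OS's own bookkeeping.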

This is OS-specific, but many implementations share logic with their memory-mapped file features (so that anonymous pages are actually memory-mapped views of the pagefile, flagged so that the content can be discarded at unmapping instead of flushed).

For Windows, much of this is documented on the CreateFileMapping page.
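You can poke at an anonymous mapping from user space with Python's standard `mmap` module, where a file descriptor of -1 requests memory that is not tied to a named file. On Windows, CreateFileMapping called with INVALID_HANDLE_VALUE produces a mapping backed by the system paging file; that Python's module goes through that call on Windows is my understanding of the implementation, not something the documentation promises.

```python
import mmap

# fileno -1 asks for an anonymous mapping, not tied to a named file.
with mmap.mmap(-1, mmap.PAGESIZE) as view:
    first_byte = view[0]      # untouched anonymous memory reads as zero
    view[0:5] = b"hello"      # writing dirties the page
    data = bytes(view[0:5])   # read the bytes back through the same view
```

When the `with` block ends the view is unmapped and the contents are simply dropped, which matches the "discarded at unmapping instead of flushed" behaviour described above.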