Where is the pathname of the file after mmap is called?

Question

char *p = (char*) mmap(...);
....; /* check that p is not MAP_FAILED, i.e. (void *) -1 */
a = *p;

When the last statement runs, a page fault occurs. The fault handler in the kernel allocates a page of physical memory, copies 4K bytes from the file into that page, and then updates the page table entry. The instruction that reads *p is then executed again, successfully this time.

But how does the fault handler know the file name and path associated with the page? Where is the filename (or the fd) stored? And the offset in the file?

What if a page in the data segment of a process is swapped out (into a swap file, I guess)? How does the kernel know where to copy from when the page needs to be swapped in later?

Answer 1

The handler doesn't know the file name or the path, because it doesn't use those (you can tell, because even if the file is deleted from the file system after the mapping is created, the mapping continues to work just fine; the file contents remain valid until all open file descriptors are closed and all memory mappings are removed).

It doesn't use the fd either; you're allowed to close the fd passed to mmap immediately after the mmap call, and the mapping remains valid (this is in fact necessary on some systems with low ulimits for open file handles; you can map 10,000 files at once, but you couldn't hold open fds for all of them if the ulimit for fds was 1000).
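Both claims above are easy to check. The following is only a minimal sketch for a POSIX system: the helper name read_after_close_and_unlink and the /tmp/mmap_demo_XXXXXX template are made up for illustration. It maps a temporary file, closes the fd and deletes the name, and only then reads through the mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: writes `text` to a temporary file, maps it,
 * then closes the descriptor and unlinks the path BEFORE reading.
 * Returns the first mapped byte, or -1 on any failure. */
int read_after_close_and_unlink(const char *text)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";  /* illustrative template */
    size_t len = strlen(text);
    if (len == 0)
        return -1;                           /* mmap rejects a zero length */

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    if (write(fd, text, len) != (ssize_t)len) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

    /* From here on, neither the descriptor nor the name exists. */
    close(fd);
    unlink(path);

    if (p == MAP_FAILED)
        return -1;

    /* This read may page-fault; it is still served from the file. */
    int first = (unsigned char)p[0];
    munmap(p, len);
    return first;
}
```

Calling read_after_close_and_unlink("hello") should give back 'h' even though, by the time the byte is read, the file has no name and the process has no descriptor for it.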

What happens is that, at mmap time, the virtual memory manager for the OS sets up a bunch of virtual memory tables that basically say "this memory is backed by the following disk sectors". On Linux, for example, each mapped region records a reference to the kernel's open-file object and the starting offset within it, rather than a name or a descriptor number. It uses a very similar process when retrieving data that has been written to the swap file and must be read back in. The only differences are in how aggressively memory and disk are synced, whether the mapping to a particular disk sector is static or dynamic (though even for "real" files, the disk sector could change as you run, e.g. when writing to a copy-on-write file system), whether the memory must be written (swap) or can simply be dropped (mmap-ed file without dirty pages) under memory pressure, etc.
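On Linux specifically, you can peek at the per-mapping record the kernel keeps by reading /proc/self/maps, which lists each mapped region with its address range, permissions, and file offset. The sketch below assumes Linux and that line format; the helper name recorded_offset and the temporary file template are hypothetical. It maps one page at a given file offset, closes the fd, and then reads back the offset the kernel reports for that region.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (Linux only): maps one page of a temporary file
 * starting at `offset` (must be a multiple of the page size), closes
 * the fd, and looks the mapping up in /proc/self/maps.
 * Returns the file offset the kernel reports, or -1 on failure. */
long recorded_offset(long offset)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";  /* illustrative template */
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || offset < 0 || offset % page != 0)
        return -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                            /* the name is not needed */
    if (ftruncate(fd, offset + page) == -1) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, (size_t)page, PROT_READ, MAP_PRIVATE, fd, offset);
    close(fd);                               /* the fd is not needed either */
    if (p == MAP_FAILED)
        return -1;

    long found = -1;
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps != NULL) {
        char line[1024];
        while (fgets(line, sizeof line, maps) != NULL) {
            unsigned long start, end, off;
            /* Assumed line layout: start-end perms offset dev inode path */
            if (sscanf(line, "%lx-%lx %*s %lx", &start, &end, &off) == 3
                && start == (unsigned long)(uintptr_t)p) {
                found = (long)off;
                break;
            }
        }
        fclose(maps);
    }
    munmap(p, (size_t)page);
    return found;
}
```

If this behaves as expected, recorded_offset(0) reports 0 and a call with two pages' worth of offset reports that same value, which suggests the offset lives in the kernel's bookkeeping for the mapping and not in the descriptor.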

There are several layers of virtual memory address translation involved that differ by CPU and OS, so the exact mechanics differ, but the basic idea is that after mmapping, you're bypassing the directory structure and interacting with underlying disk sectors in a way that ignores stuff like names and paths.

solution1
0 2016-09-16 18:27:28
