
Understanding shadow paging and how it differs from journaling file systems

I am trying to get a good grasp of shadow paging in Unix-like file systems, something you might see in ZFS or WAFL. It seems that in shadow paging, when one wants to make changes to a page, the changes are written to a different page, or "shadow page". Upon completion of the operation(s), i.e. when everything has been committed, the shadow page replaces the old page. Is this a correct (albeit high-level) understanding of shadow paging?

How does shadow paging then differ from journaling file systems? They seem pretty similar.

Thank you for your time!

Both systems provide atomicity / consistency, but via different mechanisms:

  • Shadow paging always allocates a new block when you modify something, and when a block is overwritten its old copy becomes free since there will be no references to it from any other live filesystem blocks. Crash consistency comes through a recursive metadata update up the tree -- you update where the leaf block lives (copied somewhere else during modification), and its parent must be updated (copied somewhere else during modification), etc. The new version of the filesystem with all modifications becomes visible when the entire chain up to the root of the tree has been updated.
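That recursive path copy can be sketched in a few lines of Python. This is a toy model, not ZFS or WAFL code: blocks live in a dict keyed by block number, and `alloc` and `cow_write` are illustrative names I'm inventing here.

```python
# Toy copy-on-write block store: nothing is ever modified in place.
next_block = 0
disk = {}  # block number -> content

def alloc(content):
    """Write content to a fresh block; never overwrite an existing one."""
    global next_block
    next_block += 1
    disk[next_block] = content
    return next_block

def cow_write(root, path, data):
    """Return a new root after rewriting the leaf at `path`.

    `path` is a list of child indices. Every block on the path is
    copied to a new location; untouched siblings are shared by
    reference with the old tree.
    """
    node = disk[root]
    if not path:                       # leaf: just write the new data
        return alloc(data)
    children = list(node)              # interior node = list of child block numbers
    i = path[0]
    children[i] = cow_write(children[i], path[1:], data)
    return alloc(children)             # the parent must move too

# Build a tiny tree: root -> [leaf A, leaf B]
leaf_a = alloc("A"); leaf_b = alloc("B")
old_root = alloc([leaf_a, leaf_b])

new_root = cow_write(old_root, [0], "A'")      # update leaf 0

assert disk[disk[old_root][0]] == "A"          # old tree fully intact
assert disk[disk[new_root][0]] == "A'"         # new tree sees the update
assert disk[old_root][1] == disk[new_root][1]  # leaf B is shared, not copied
```

Note that committing here amounts to a single atomic action: switching the published root pointer from `old_root` to `new_root`. If you crash before that switch, the old tree is still complete and consistent.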

  • Journaling allows you to modify blocks in place, but you still have to write them twice: once to the journal marking your intent (and providing multi-update atomicity if needed, such as for implementing moving a file from one directory to another), and then once to the block's final location on disk. Since you're modifying in place, overwriting a block usually doesn't require updating many other filesystem tree blocks beyond the specific ones you overwrote, because blocks do not change position when you write new versions of them.
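The journaling protocol can be sketched the same way. Again a toy model, not ext4/jbd2 code; the block names and the `begin_tx` / `checkpoint` / `replay` functions are illustrative:

```python
# Toy physical-block journal: every update is written twice,
# first to the journal, then in place at its final location.
disk = {"dir_src": ["file"], "dir_dst": []}
journal = []   # list of transactions: {"updates": block -> new content, "committed": bool}

def begin_tx(updates):
    """First write: record intent in the journal before touching real blocks."""
    tx = {"updates": dict(updates), "committed": False}
    journal.append(tx)
    tx["committed"] = True     # the commit record is what makes the tx atomic
    return tx

def checkpoint(tx):
    """Second write: apply the journaled updates in place."""
    for block, content in tx["updates"].items():
        disk[block] = content

def replay():
    """After a crash, redo every committed transaction; drop uncommitted ones."""
    for tx in journal:
        if tx["committed"]:
            checkpoint(tx)

# Atomically move "file" between directories: both block updates in one tx.
tx = begin_tx({"dir_src": [], "dir_dst": ["file"]})

# Simulate a crash before checkpointing: recovery redoes the work
# from the journal, so the move is all-or-nothing.
replay()
assert disk == {"dir_src": [], "dir_dst": ["file"]}
```

The key invariant is that the in-place write never starts before the commit record is durable, so a crash can never leave only half of a multi-block update applied.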

The biggest difference is that shadow paging / copy on write makes it super easy to implement snapshots in the filesystem -- all you need to do is keep track of an old version of the root of the filesystem tree, and anything it referenced at the time. In journaling this is much harder because any block can be overwritten at any time, and the journal is not infinite -- usually it's overwritten quite quickly because otherwise it would take up a bunch of space on disk.
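In code terms, a snapshot under copy-on-write really is just a saved root pointer plus a rule that a block is only freeable when no root reaches it. A minimal sketch (names and structure illustrative, not from any real filesystem):

```python
# Blocks 1-2 are the old (snapshotted) tree, 3-4 the live tree.
# Interior nodes are lists of child block numbers; leaves are data.
disk = {1: "old data", 2: [1], 3: "new data", 4: [3]}

snapshots = {"before-upgrade": 2}   # a snapshot = remembering an old root
live_root = 4

def live_blocks():
    """A block is free only when no root (live or snapshot) reaches it."""
    reachable = set()
    stack = [live_root] + list(snapshots.values())
    while stack:
        b = stack.pop()
        if b in reachable:
            continue
        reachable.add(b)
        if isinstance(disk[b], list):
            stack.extend(disk[b])
    return reachable

assert live_blocks() == {1, 2, 3, 4}   # nothing freeable while the snapshot exists
del snapshots["before-upgrade"]        # deleting the snapshot...
assert live_blocks() == {3, 4}         # ...makes the whole old tree freeable
```

This is why snapshots are nearly free to take under copy-on-write: no data is copied at snapshot time, only a root pointer is retained.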

Probably the biggest downside of copy-on-write, especially for spinning disks, is that it tends to swiss-cheese your data, causing it to become quite fragmented and therefore require more disk seeks during large sequential reads of files that are updated frequently. ZFS has this problem, and I think some later copy-on-write systems worked around this by having some intermediate layer mapping logical block addresses to physical addresses to allow the data to be defragmented.
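The indirection idea in that last sentence can be sketched as a mapping table between logical and physical block addresses; since the filesystem tree refers only to logical addresses, a defragmenter can move data around by rewriting the table alone. Everything here is illustrative (the names `lba_to_pba`, `defragment`, etc. are mine, not from any real system):

```python
# The tree stores logical block addresses (LBAs); this table resolves
# them to physical addresses (PBAs), so data can move without the
# tree blocks being rewritten.
lba_to_pba = {10: 507, 11: 912, 12: 33}   # scattered after many CoW updates
physical_media = {507: "a", 912: "b", 33: "c"}

def read(lba):
    return physical_media[lba_to_pba[lba]]

def defragment(lbas, start_pba):
    """Move a run of logical blocks to contiguous physical blocks;
    only the mapping table changes, never the filesystem tree."""
    for i, lba in enumerate(lbas):
        data = physical_media.pop(lba_to_pba[lba])
        lba_to_pba[lba] = start_pba + i
        physical_media[lba_to_pba[lba]] = data

defragment([10, 11, 12], start_pba=1000)
assert [lba_to_pba[l] for l in (10, 11, 12)] == [1000, 1001, 1002]
assert read(11) == "b"   # contents unchanged, now physically contiguous
```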
