
Page Faults and Programming in .NET

I'm reading a book about Operating Systems and it gives some examples in C that I mostly understand. The example I'm looking at now shows two nearly identical pieces of code that will run on a fictitious system...

int i, j;
int data[128][128];

for (j = 0; j < 128; j++)
    for (i = 0; i < 128; i++)
        data[i][j] = 0;

And the second piece of code

int i, j;
int data[128][128];

for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        data[i][j] = 0;

On this particular system the first section of code would result in 16k page faults, while the second would result in only 128.

My apologies if this is a silly question, but in my experience with .NET I've always been largely unaware of memory. I just create a variable and it is 'somewhere', but I don't know where and I don't care.

My question is: how would .NET compare to these C examples on this fictional system? (Pages are 128 words in size, so each row of the array takes one full page. The first example sets one int on page 1, then one int on page 2, and so on, while the second sets all the ints on page 1, then all the ints on page 2, and so on.)

Also, while I think I understand why the code produces different levels of paging, is there anything meaningful I can do with that? Isn't the size of a page dependent on the operating system? Is the takeaway that, as a general rule of thumb, you should access memory in an array as contiguously as possible?

Fictitious operating systems can be useful to demonstrate principles, but no real operating system I know of actually works this way. You could only get 16K page faults if each page immediately gets unmapped after use. While that is technically possible, the machine would have to be under major load, desperately short of RAM to support the set of running processes. At that point you just don't care about performance anymore; the machine will have slowed to a crawl anyway. The technical term for that is "thrashing".

There's something much more important going on in this code that you didn't mention: the level 1 CPU cache plays a major role in how quickly you can access array elements. A cache line (typically 64 bytes) gets filled from memory on the first access. Accessing the next element, at the next memory address, is then very cheap, because the data is already in the cache. Accessing the array with the outer index changing first is very expensive: it requires another cache-line fill, which can take hundreds of cycles in the worst case, with the considerable risk of evicting an existing cache line that also contains array data, since the level 1 cache is not very big. Getting this right requires knowing how the runtime orders array elements in memory.
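You can make the cache effect visible by timing the two traversal orders yourself. Here is a minimal sketch; the use of POSIX clock_gettime for timing and the 4096×4096 matrix size (picked simply to be much larger than any cache) are my choices, not anything from this answer:

#include <stdio.h>
#include <time.h>

#define N 4096

int data[N][N];   // ~64 MB, far larger than the L1/L2/L3 caches

double time_fill(void (*fill)(void))
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fill();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

void row_order(void)      // inner loop walks consecutive addresses
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            data[i][j] = 1;
}

void column_order(void)   // inner loop jumps N * sizeof(int) bytes per store
{
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            data[i][j] = 1;
}

int main(void)
{
    printf("row order:    %.3f s\n", time_fill(row_order));
    printf("column order: %.3f s\n", time_fill(column_order));
    return 0;
}

On typical hardware the column-order version runs several times slower, purely because each store lands on a different cache line (and, for a matrix this size, a different page).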

Your two samples differ only in the loop order; renaming the loop variables so both use the same loop order makes the comparison clearer, because it moves the difference into the indexing expression itself:

int i, j;
int data[128][128];

for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        data[i][j] = 0;  // <--- i,j

And the second piece of code

int i, j;
int data[128][128];

for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        data[j][i] = 0;   // <--- j,i

So assuming a page size of 128 * sizeof(int), i.e. one row per page:

  • in one case you are iterating sequentially over the array. Each page faults at most once, so the worst-case number of page faults is the size of the array in pages: 128.

  • in the other case you are jumping to a new page on each iteration, so in the worst case you may take a page fault on every single store: 128 × 128 = 16K. (The sketch after this list prints the byte offsets each order touches.)
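A small sketch that prints the byte offset of the first few elements touched by each order, assuming row-major layout and a 4-byte int (the 512-byte "page" is the fictitious system's, not a real one):

#include <stdio.h>

int data[128][128];   // one 512-byte row per fictitious page

int main(void)
{
    char *base = (char *) &data[0][0];

    printf("varying the last index (data[0][j]):\n");
    for (int j = 0; j < 4; j++)
        printf("  offset %ld\n", (long) ((char *) &data[0][j] - base));   // 0, 4, 8, 12

    printf("varying the first index (data[i][0]):\n");
    for (int i = 0; i < 4; i++)
        printf("  offset %ld\n", (long) ((char *) &data[i][0] - base));   // 0, 512, 1024, 1536

    return 0;
}

The first order advances 4 bytes per store and stays on one page for 128 stores; the second advances a full page on every store.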

The situation is exactly the same in .NET: a rectangular array (int[128,128]) is stored contiguously in row-major order, so the same reasoning about iteration order applies.

The number of page faults you get will depend on the number of pages you access which are not in memory.

Pages on the stack are more likely to be in memory unless this is the first time the stack has been used to this degree.

Assuming your page size is 4 KB, as it is on many systems, and none of the pages are in memory, you will be accessing 128 * 128 * 4 bytes, or 64 KB, which is 16 pages. Note: it is possible for this structure to span 17 pages if it doesn't start on a page boundary, in which case the first page may already be in memory because it is shared with whatever data precedes the array.
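Since page size varies by OS and hardware, you don't have to guess: on a POSIX system you can query it at runtime with sysconf (Windows exposes the same information through GetSystemInfo). A minimal sketch:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long page  = sysconf(_SC_PAGESIZE);         // e.g. 4096 on most x86 systems
    long bytes = 128L * 128L * sizeof(int);     // 64 KB for the example array
    printf("page size: %ld bytes\n", page);
    printf("pages touched by the array: %ld\n", (bytes + page - 1) / page);
    return 0;
}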

As long as you access every page, it doesn't matter how you access them or in what order.


It could be that you are actually asking about cache misses; there the order in which you access the data does make a difference. In that case it is best to access the data incrementally along the last index, i.e. for a[i][j] you want the inner loop to vary j. However, if your cache is large enough, this may not matter. It makes a big difference for really large structures which don't fit into the faster caches.
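Whether your data fits in the faster caches is also something you can check at runtime. This sketch uses the _SC_LEVEL1_DCACHE_* sysconf names, which are a glibc/Linux extension rather than portable POSIX (they may return 0 or -1 elsewhere):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("L1 data cache size: %ld bytes\n", sysconf(_SC_LEVEL1_DCACHE_SIZE));
    printf("L1 cache line size: %ld bytes\n", sysconf(_SC_LEVEL1_DCACHE_LINESIZE));
    return 0;
}

If the reported L1 size is smaller than your working set, expect the access-order effects described above to show up.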
