
Comparing performance of ext4 and NTFS

As part of a class project, I am benchmarking ext4 and NTFS (on RHEL and Windows 7 respectively) using C.

I am trying to come up with experiments for the benchmarks, but am very confused about what should be done and whether what I'm thinking makes any sense (my prof is out for 2 weeks and the class has no TA). Here are the details and my questions:

  • Comparing sequential-read speeds: I plan to have a few files of different sizes and perform sequential reads on them, with each read being one block (4K on both file systems). Should I do this on files of different sizes or on the same file? Also, does it make sense to read in multiples of the block size? Of course, for reads beyond the first, caching would come into play.

  • Comparing random-read speeds: Run reads (again of the block size) at different file offsets. Again, same questions as above.

  • Comparing write speeds: Disable write caching on both and write one block at a time. (A minimal timing sketch for these loops follows this list.)
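For reference, here is a minimal sketch of what the Linux/POSIX side of such a timed loop could look like. The file name "testfile" and the block size are placeholders, and the Windows counterpart would use CreateFile/ReadFile with QueryPerformanceCounter instead; this is an illustration of the approach, not the assignment's required method.

```c
/* Sketch: time block-sized sequential reads on the POSIX side.
 * "testfile" and BLOCK_SIZE are placeholders for the real test setup. */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

int main(void)
{
    char buf[BLOCK_SIZE];
    struct timespec start, end;

    int fd = open("testfile", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    clock_gettime(CLOCK_MONOTONIC, &start);

    ssize_t n;
    long long total = 0;
    while ((n = read(fd, buf, sizeof buf)) > 0)   /* sequential: one block per read() */
        total += n;

    /* For the random-read test, replace the loop body with
     * lseek(fd, chosen_block * (off_t)BLOCK_SIZE, SEEK_SET) followed by read(). */

    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    double secs = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("%lld bytes in %.3f s (%.1f MB/s)\n", total, secs, total / secs / 1e6);
    return 0;
}
```

The write benchmark would follow the same shape with write() in the loop, plus whatever mechanism you use to disable or bypass write caching on each platform.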

My main concern is with respect to the sizes of the files I run my tests on and the size to use for each operation (read/write).

Any other pointers with respect to what I could include in my experiments and any other changes required to my approach would be greatly appreciated.

I also plan to compare metadata operations but I'm working on those and will probably post a question later if needed.

As benchmarks of filesystems, they will be invalid. There are too many variables. Assuming you'll use the same hardware, your variables are...

  • Filesystem (ext4 vs NTFS)
  • Operating system (Redhat vs Windows)
  • C compiler (gcc/clang vs Visual C++ (I'm guessing))

What you're benchmarking is Redhat + its C compiler + ext4 vs Windows + its compiler + NTFS. No general statements about the filesystems can be drawn from this, only about those combinations. You might want to point out this flaw in the homework for extra points, or it might just annoy the TA. Your call.

Redhat does have an NTFS implementation, so you could eliminate the variables by benchmarking everything on Redhat, but then you'll be benchmarking Redhat's NTFS implementation vs Redhat's ext4 implementation. This would have no bearing on Windows's NTFS implementation.

It's possible you could conduct extra tests to eliminate the compiler and operating system variables, but I don't know what they are.


Putting that aside, because this is homework, you ask what scenarios to run. The answer is all of them. Benchmarks are supposed to reflect real-world usage, and in the real world you read and write files of different sizes and different caching states. Put them in a big matrix and run all combinations.

Generally, file sizes should range from the block size all the way up to 4 gigs (simulating a video file). The more increments the better; powers of 1024 would be a good start, and three points make a curve. So 4K, 4M, 4G.
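As a sketch of how those test files could be generated (file names and the use of posix_fallocate are my assumptions, not part of the assignment):

```c
/* Sketch: create one test file per size in the matrix (4K, 4M, 4G).
 * posix_fallocate only reserves space; for read benchmarks you may want
 * to write real data instead, so you aren't reading unwritten extents. */
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const off_t sizes[] = { 4096LL, 4096LL * 1024, 4096LL * 1024 * 1024 };
    char name[32];

    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
        snprintf(name, sizeof name, "testfile_%zu", i);

        int fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        int err = posix_fallocate(fd, 0, sizes[i]);   /* reserve the full size on disk */
        if (err != 0) {
            fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
            return 1;
        }
        close(fd);
    }
    return 0;
}
```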

You should benchmark both with and without caching enabled, to test how well the filesystem performs when reading an uncached file and when reading a cached file. Both represent real-world scenarios.
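On the Linux side, one way to get a "cold" (uncached) read is to ask the kernel to drop the file's cached pages before the timed run. A sketch, with the helper name being illustrative (on Windows, opening the file with FILE_FLAG_NO_BUFFERING is the usual way to bypass the cache):

```c
/* Sketch: force a cold read on Linux by evicting the file's pages from
 * the page cache before the timed run. Alternatives: open with O_DIRECT,
 * or write 3 to /proc/sys/vm/drop_caches as root between runs. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: call on an open fd before starting the benchmark loop. */
static int evict_from_cache(int fd)
{
    if (fsync(fd) < 0)                      /* flush any dirty pages first */
        return -1;
    /* Hint the kernel that cached pages for this file are no longer needed. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    if (evict_from_cache(fd) != 0)
        fprintf(stderr, "cache eviction hint failed\n");

    /* ... run the timed read loop here ... */

    close(fd);
    return 0;
}
```

Running the same read loop a second time without evicting the cache then gives you the warm-cache number for comparison.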
