Read a file, but hint the kernel not to cache its contents?

I noticed major problems with my system's responsiveness (the desktop GUI) after running a command like this:

cat file_larger_than_ram.bin | ./simple-process

My theory is that this causes the Linux kernel to discard the file cache it had been holding in otherwise unused RAM. The other processes still need the data they work on, so after the command above finishes they have to load their files again. Given that I'm only going to use file_larger_than_ram.bin once, is there a way to hint the kernel not to cache the file? I heard that I could use fadvise for that, but I'm not sure, given what fadvise64(2) says:

POSIX_FADV_DONTNEED attempts to free cached pages associated with the specified region. This is useful, for example, while streaming large files. A program may periodically request the kernel to free cached data that has already been used, so that more useful cached pages are not discarded instead.

Would calling posix_fadvise(input_desc, 0, 0, POSIX_FADV_DONTNEED); actually behave as I expect and solve the problem here?

Given that I'm only going to use file_larger_than_ram.bin once, is there a way to hint the kernel not to cache the file?

To my knowledge, that is possible with the O_DIRECT flag to the open() syscall. But the flag comes with additional restrictions (e.g. alignment of the file offset and of the user-space memory buffer) which might cause problems. They didn't cause problems for me in my tests, but the documentation says that the behavior is device and file system specific, so I changed my code to use fadvise() instead.

(Additionally, I observed some performance irregularities: some read() / write() calls completed too quickly, which suggested that even with O_DIRECT some data was occasionally being cached. YMMV.)
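To make the alignment restrictions concrete, here is a minimal Python sketch of an O_DIRECT read loop. It is not the code from my tests: the function name read_direct and the 1 MiB block size are assumptions for illustration, and the fallback to a normal open is just one way to cope with file systems that reject the flag. The page-aligned buffer comes from an anonymous mmap.

```python
import mmap
import os

# Assumed block size: a multiple of common page and sector sizes.
BLOCK_SIZE = 1024 * 1024


def read_direct(path, block_size=BLOCK_SIZE):
    """Yield the contents of `path` in chunks, using O_DIRECT where possible.

    Hypothetical helper for illustration only.
    """
    flags = os.O_RDONLY | getattr(os, "O_DIRECT", 0)
    try:
        fd = os.open(path, flags)
    except OSError:
        # Some file systems refuse O_DIRECT; fall back to a cached open.
        fd = os.open(path, os.O_RDONLY)
    # An anonymous mapping is page-aligned, which satisfies the
    # user-space buffer alignment requirement.
    buf = mmap.mmap(-1, block_size)
    try:
        while True:
            n = os.readv(fd, [buf])
            if n == 0:
                break
            yield bytes(buf[:n])
    finally:
        buf.close()
        os.close(fd)
```

Because every read starts at a multiple of block_size and uses the full aligned buffer, only the final short read at end of file is irregular.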

I heard that I could use fadvise for that, but I'm not sure [...]

It wasn't clear to me either, so I checked the kernel source code. The effect of an fadvise() call with POSIX_FADV_DONTNEED is to remove the corresponding data from the cache at the time of the call. I haven't seen anything to suggest that the flag is sticky and applies to all subsequent operations on the file. (This is why I checked the source code: I know that Linux always performs I/O through the cache, with O_DIRECT being the alternative. A sticky POSIX_FADV_DONTNEED didn't fit into that paradigm.)

In other words, to free the cache while reading you need to:

  • keep track of the file offset before the read()

  • after the read(), call fadvise(POSIX_FADV_DONTNEED) on the range of data which you have just read

  • for best results, read the data in page-aligned blocks. The I/O cache is page based, and fadvise() maps the specified data range onto a list of pages. Misalignment would cause extra reads (and harm performance) but is otherwise harmless.
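The steps above can be sketched in Python with os.posix_fadvise, which wraps the same call. The function name stream_without_caching, the handle_chunk callback and the 4 MiB chunk size are assumptions made for this example, not anything prescribed by the kernel.

```python
import os

# Assumed chunk size: a multiple of the page size, so ranges stay page-aligned.
CHUNK_SIZE = 4 * 1024 * 1024


def stream_without_caching(path, handle_chunk, chunk_size=CHUNK_SIZE):
    """Read `path` sequentially, dropping each chunk from the page cache.

    Hypothetical helper: `handle_chunk` is any callable taking bytes.
    Returns the number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0  # file offset before the next read()
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            handle_chunk(data)
            # Ask the kernel to drop the pages we have just consumed.
            os.posix_fadvise(fd, offset, len(data), os.POSIX_FADV_DONTNEED)
            offset += len(data)
    finally:
        os.close(fd)
    return offset
```

For the pipeline in the question, handle_chunk could be sys.stdout.buffer.write, with the script taking the place of cat.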

For writing it is a little more complicated: I have observed that fadvise(POSIX_FADV_DONTNEED) has no effect if called right after the write(). One has to call fsync() / fdatasync() first to force the data to be written out, which unpins the cache entries, and only then call fadvise(POSIX_FADV_DONTNEED) to free them.
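A rough Python sketch of that write sequence follows. The function name write_without_caching is an assumption for illustration, and syncing after every chunk is the simplest possible policy rather than a tuned one.

```python
import os


def write_without_caching(path, chunks):
    """Write an iterable of bytes objects to `path`, then evict them from cache.

    Hypothetical helper. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        for chunk in chunks:
            if not chunk:
                continue  # a zero length would mean "to end of file" below
            view = memoryview(chunk)
            while view:
                written = os.write(fd, view)
                view = view[written:]
            # Dirty pages cannot be dropped, so flush them first ...
            os.fdatasync(fd)
            # ... and only then ask the kernel to free the written range.
            os.posix_fadvise(fd, offset, len(chunk), os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
    finally:
        os.close(fd)
    return offset
```

Calling fdatasync() for every chunk is expensive, so in practice larger chunks, or a sync every few chunks, would likely be a better trade-off.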

PS As far as I have understood the kernel code, the trick with dd linked to by @AlexHoppus should work. For example: cat file; dd if=file of=/dev/null iflag=nocache. The cat call would put the file into the cache, dd would read it from the cache and then discard it from the cache. fadvise(POSIX_FADV_DONTNEED) operates on the global cache, so it is irrelevant who read the data or when: the call discards it anyway.
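Since the cache is global, the same eviction can be requested from a separate process after the fact. Below is a minimal Python sketch; the name drop_file_cache is an assumption, and it relies on a length of 0 meaning "to the end of the file", as in the call from the question.

```python
import os


def drop_file_cache(path):
    """Ask the kernel to drop all cached pages of `path` (hypothetical helper).

    This is only a hint: dirty or otherwise pinned pages may stay cached.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, length=0 covers the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Run after cat file_larger_than_ram.bin | ./simple-process, this would free whatever part of the file is still cached, but it would not stop the streaming itself from evicting other data in the meantime.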
