Read a file, but hint the kernel not to cache its contents?

Original post: 2015-10-28 11:55:10. Tags: linux, caching, linux-kernel, posix, system-calls

I noticed I'm getting major problems with the system's responsiveness (desktop GUI) after running a command like this:

cat file_larger_than_ram.bin | ./simple-process

My theory is that this causes the Linux kernel to evict the cached files it had been holding in the otherwise unused part of RAM. The other processes still need the data they work on, so after the command above has run they have to load their files again. Given that I'm only going to use file_larger_than_ram.bin once, is there a way to hint the kernel not to cache the file? I heard that I could use fadvise for that, but I'm not sure, given what fadvise64(2) says:

POSIX_FADV_DONTNEED attempts to free cached pages associated with the specified region. This is useful, for example, while streaming large files. A program may periodically request the kernel to free cached data that has already been used, so that more useful cached pages are not discarded instead.

Would calling posix_fadvise(input_desc, 0, 0, POSIX_FADV_DONTNEED); actually behave as I expect and solve the problem here?

1 answer

Given that I'm only going to use file_larger_than_ram.bin once, is there a way to hint the kernel not to cache the file?

To my knowledge, that is possible with the O_DIRECT flag to the open() syscall. But the flag comes with additional restrictions (e.g. alignment of the file offset and of the user-space memory buffer) which might cause problems. They didn't cause problems for me in my tests, but the documentation says that the behavior is device and file system specific, so I changed my code to use fadvise() (posix_fadvise() in the C library) instead.

(Additionally, I observed some performance irregularities (the read() / write() calls were too fast), which suggested that even with O_DIRECT some data was sometimes being cached. YMMV.)
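The alignment restrictions that come with O_DIRECT could look roughly like this in C. This is only a sketch: the function name read_with_o_direct and the 4096-byte alignment are assumptions made for illustration, and the real requirement depends on the device and file system. Some file systems reject O_DIRECT outright, in which case the function simply returns -1.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define DIRECT_ALIGNMENT 4096            /* assumed; device/file system specific */
#define DIRECT_BLOCK     (256 * 1024)    /* a multiple of DIRECT_ALIGNMENT */

/* Hypothetical helper: reads a whole file while bypassing the page cache.
 * Returns the number of bytes read, or -1 on error (including the case
 * where the file system does not accept O_DIRECT). */
long long read_with_o_direct(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    /* The user-space buffer must be aligned, so malloc() is not enough. */
    void *buf = NULL;
    if (posix_memalign(&buf, DIRECT_ALIGNMENT, DIRECT_BLOCK) != 0) {
        close(fd);
        return -1;
    }

    long long total = 0;
    ssize_t n;
    /* Sequential reads of DIRECT_BLOCK bytes keep the file offset aligned;
     * normally only the final read at end of file is short. */
    while ((n = read(fd, buf, DIRECT_BLOCK)) > 0) {
        /* ... process the first n bytes of buf here ... */
        total += n;
    }

    free(buf);
    close(fd);
    return n < 0 ? -1 : total;
}
```

Calling read_with_o_direct("file_larger_than_ram.bin") would stream the file once; whether the data really stays out of the cache is something to measure on the target system.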

I heard that I could use fadvise for that, but I'm not sure [...]

It wasn't clear to me either, so I checked the kernel source code. The effect of an fadvise() call with POSIX_FADV_DONTNEED is to remove the corresponding data from the cache at the time of the call. I haven't seen anything to suggest that the flag is sticky and applies to all subsequent operations on the file. (Which is why I checked the source code: I know that Linux always performs I/O through the cache, with O_DIRECT being the alternative. A sticky POSIX_FADV_DONTNEED didn't fit into that paradigm.)

In other words, to free the cache while reading you need to:

1. Keep track of the file offset before the read().
2. After the read(), call fadvise(POSIX_FADV_DONTNEED) on the range of data you have just read.
3. For best results, read the data in page-aligned blocks. The I/O cache is page based, and fadvise() maps the specified data range onto a list of pages. Misalignment would cause extra read()s (and harm performance) but is otherwise harmless.
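The steps above can be sketched in C as follows. The function name stream_and_drop and the 1 MiB block size are assumptions chosen for illustration (1 MiB is a multiple of the common page sizes); posix_fadvise() is the C library name for the fadvise call.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

#define STREAM_CHUNK (1 << 20)   /* 1 MiB, assumed to be a multiple of the page size */

/* Hypothetical helper: reads the whole file in STREAM_CHUNK-sized blocks and,
 * after each read(), asks the kernel to drop the pages just consumed.
 * Returns the number of bytes read, or -1 on error. */
long long stream_and_drop(const char *path)
{
    static char buf[STREAM_CHUNK];
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t offset = 0;            /* file offset before the next read() */
    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        /* ... process the first n bytes of buf here ... */

        /* Drop the range that was just read. This is only advice,
         * so a failure here is ignored rather than treated as fatal. */
        (void)posix_fadvise(fd, offset, n, POSIX_FADV_DONTNEED);
        offset += n;
        total += n;
    }

    close(fd);
    return n < 0 ? -1 : total;
}
```

Because the call is made once per block, the file never occupies much more than one block of cache at a time, which is the behaviour the question asks for.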

For writing it is a little more complicated: I have observed that fadvise(POSIX_FADV_DONTNEED) has no effect if called right after the write(). One has to call fsync() / fdatasync() first to force the data to be written out, which unpins the cache entries, and only then call fadvise(POSIX_FADV_DONTNEED) to free them.
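A minimal C sketch of that write sequence, assuming a hypothetical helper named write_and_drop that operates on an already open file descriptor:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes at the current offset of fd, forces
 * them out to storage, then asks the kernel to drop the now-clean pages.
 * Returns 0 on success, -1 on a seek, write or sync error. */
int write_and_drop(int fd, const char *data, size_t len)
{
    off_t start = lseek(fd, 0, SEEK_CUR);   /* where this write begins */
    if (start < 0)
        return -1;
    if (len == 0)
        return 0;                           /* a length of 0 would mean "to end of file" */

    size_t done = 0;
    while (done < len) {                    /* write() may be partial */
        ssize_t n = write(fd, data + done, len - done);
        if (n < 0)
            return -1;
        done += (size_t)n;
    }

    /* Dirty pages cannot be dropped, so flush them first ... */
    if (fdatasync(fd) != 0)
        return -1;

    /* ... and only then advise the kernel to free them (advice, result ignored). */
    (void)posix_fadvise(fd, start, (off_t)len, POSIX_FADV_DONTNEED);
    return 0;
}
```

Calling fdatasync() after every small write would be slow, so in practice one would probably batch several writes and sync and drop a larger range at once.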

PS: As far as I understand the kernel code, the trick with dd linked to by @AlexHoppus should work. For example: cat file; dd if=file of=/dev/null iflag=nocache. The cat call would put the file into the cache, dd would read it from the cache and then discard it from the cache. fadvise(POSIX_FADV_DONTNEED) operates on the global cache, so it is irrelevant who read the data or when; it would be discarded anyway.
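Since the advice applies to the global cache regardless of who read the data, a separate program can drop a file's pages after the fact. The sketch below illustrates that idea only; it is not dd's implementation, and the function name drop_file_cache is an assumption. An offset of 0 with a length of 0 covers the file from the start to its end, which is also the call proposed in the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: asks the kernel to drop every cached page of the
 * file at path. Returns -1 if the file cannot be opened; otherwise returns
 * the result of posix_fadvise(), which is 0 on success or an error number. */
int drop_file_cache(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* offset 0, length 0: the whole file, from the start to the end */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    close(fd);
    return rc;
}
```

Pages that are still dirty are not dropped by this call, which is consistent with the observation about write() above.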

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

solution1 2 2015-12-10 10:24:49
