I am in need of a simple and portable way to explicitly prefetch data. I do not want to use the specific feature of any specific compiler or platform, just something generic enough to work across different platforms and compilers.
One very naive solution that comes to mind is to simply load a byte/int from the memory location into a register. That "should" pull the surrounding memory into the CPU cache and fill a cache line, or at least that is what I logically assume. But maybe it won't be that easy? One risk is that the compiler optimizes the load away because the value is never used in that scope, in which case no prefetching occurs.
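To make the naive idea concrete, here is a minimal sketch of a "touch" load that the compiler should not drop. The names touch_byte and touch_range are hypothetical, and the 64-byte line size is an assumption that does not hold on every CPU. The read goes through a volatile-qualified pointer, which in practice makes mainstream compilers emit the load even though the result is unused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read one byte through a volatile-qualified pointer
 * so the compiler keeps the load even if the caller ignores the result. */
static inline unsigned char touch_byte(const void *addr)
{
    return *(const volatile unsigned char *)addr;
}

/* Hypothetical helper: touch one byte per assumed cache line in a buffer.
 * Returns how many loads were issued. */
size_t touch_range(const void *buf, size_t len)
{
    enum { LINE = 64 };  /* assumed cache line size in bytes */
    const unsigned char *p = buf;
    size_t touched = 0;

    assert(buf != NULL || len == 0);
    for (size_t off = 0; off < len; off += LINE) {
        (void)touch_byte(p + off);
        ++touched;
    }
    return touched;
}
```

Note that this is a real load, not a hint: the CPU has to complete it, and touching an invalid address will fault.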
Generally speaking, prefetching and memory loads are not exactly the same operations. There are a few fundamental differences:
So just stick with __builtin_prefetch and let the compiler do the hard work.
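Since __builtin_prefetch is a GCC/Clang builtin, one way to keep the rest of the code portable is to hide it behind a macro that falls back to a no-op elsewhere. The sketch below assumes that approach; PREFETCH_READ and sum_with_prefetch are made-up names, and the look-ahead distance of 16 elements is an arbitrary starting point that would need to be tuned by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* PREFETCH_READ is a hypothetical wrapper name, not a standard macro.
 * Arguments to __builtin_prefetch: address, 0 = prefetch for read,
 * 3 = high temporal locality (keep in all cache levels if possible). */
#if defined(__GNUC__) || defined(__clang__)
#  define PREFETCH_READ(addr) __builtin_prefetch((addr), 0, 3)
#else
#  define PREFETCH_READ(addr) ((void)0)  /* unknown compiler: do nothing */
#endif

/* Sum an array while hinting at the data a few elements ahead. */
long sum_with_prefetch(const long *data, size_t n)
{
    enum { AHEAD = 16 };  /* assumed look-ahead distance, in elements */
    long total = 0;

    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (i + AHEAD < n)
            PREFETCH_READ(&data[i + AHEAD]);
        total += data[i];
    }
    return total;
}
```

On a compiler that takes the fallback branch the code still compiles and gives the same result, just without the hint. A simple sequential loop like this one is also a case the hardware prefetcher usually handles well by itself, so treat it purely as a demonstration of the wrapper.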
Also, keep in mind that optimizing compilers may generate prefetch instructions automatically. I guess if they do, then you'd have to make sure you do not interfere with that.
Another interesting thing is that explicit prefetching often does not improve performance and can even degrade it slightly. See this LWN article for details and an explanation of why the prefetch calls were removed from the Linux kernel's linked-list traversal macros.
Hope it helps. Good Luck!