
What is the L1/L2 cache behavior for LUTs and the like?

Assume a LUT of, say, 512 KB of 64-bit double values. Generally speaking, how does the CPU cache such a structure in L1 or L2?

For example: if I access the middle element, does the CPU attempt to cache the whole LUT, or just some of it -- say, the middle element and then n subsequent elements?

What kind of algorithms does the CPU use to determine what it keeps in L2 cache? Is there a certain look-ahead strategy it follows?

Note: I'm assuming x86, but I'd be interested in knowing how other architectures work as well (POWER, SPARC, etc.).

It depends on the data structure you use for the LUT (look-up table?).

Caches are at their best with things that are laid out contiguously in memory (e.g. as arrays or std::vectors) rather than scattered around.

In simple terms, when you access a memory location, a block of RAM (a "cache line's" worth -- 64 bytes on x86) is loaded into the cache, possibly evicting some previously-cached data.

Generally, there are several levels of cache, forming a hierarchy. With each level, access times increase but so does capacity.

Yes, there is look-ahead (hardware prefetching), but it uses rather simplistic algorithms and cannot cross page boundaries (a memory page is typically 4 KB in size on x86).

I suggest that you read What Every Programmer Should Know About Memory. It has lots of great info on the subject.

Caches are generally organized as a collection of cache lines. Each line covers a block of memory aligned to the line size; for example, a cache with 128-byte lines holds data for addresses aligned to 128-byte boundaries.

CPU caches generally use some (approximate) LRU eviction mechanism (least recently used, as in: on a cache miss, evict the line in the set that was accessed longest ago), as well as a mapping from a memory address to a particular set of cache lines. (This set mapping is why reading from multiple addresses aligned on a 4 KB or 16 MB boundary on x86 can cause conflict misses -- an effect sometimes lumped in with false sharing.)

So, when you have a cache miss, the CPU reads in the cache line containing the address that missed. If the access happens to straddle a cache line boundary, it will read in two cache lines.
