I am using the MARS 4.5 simulator and trying to improve the cache performance of the code below. It is a subsection of an assembly program that computes prime numbers with the Sieve of Eratosthenes.
For some reason the sw (store word) instruction has a cache hit rate of 25%, while the rest of the program averages about 50% in its current state. I've tried rearranging some things, but I can't figure out what is causing this bottleneck. What needs to be done to improve this cache hit rate?
inner: add $t2, $s2, 0 # save the bottom of stack address to $t2
mul $t3, $t1, 4 # calculate the number of bytes to jump over
sub $t2, $t2, $t3 # subtract them from bottom of stack address
add $t2, $t2, 8 # add 2 words - we started counting at 2!
sw $s0, ($t2) # store 1's -> it's not a prime number!
add $t1, $t1, $t0 # do this for every multiple of $t0
bgt $t1, $t9, outer # every multiple done? go back to outer loop
j inner # some multiples left? go back to inner loop
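I was able to fix this issue by modifying the program to store bytes instead of words. Each flag now occupies one byte instead of four, so four times as many flags fit in each cache block. Consecutive multiples of $t0 are therefore closer together in memory and more likely to land in a block that is already cached, which increased the hit rate.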
inner: add $t2, $s2, 0 # save the bottom of stack address to $t2
addi $t3, $t1, 1 # byte offset for this multiple: $t1 + 1 (one byte per flag)
sub $t2, $t2, $t3 # subtract them from bottom of stack address
add $t2, $t2, 2 # add 2 bytes - we started counting at 2!
sb $s0, ($t2) # store 1's -> it's not a prime number!
add $t1, $t1, $t0 # do this for every multiple of $t0
bgt $t1, $t9, outer # every multiple done? go back to outer loop
j inner # some multiples left? go back to inner loop