
How to improve cache performance on this MIPS code

Using a simulator called MARS 4.5, I am trying to improve the cache performance of this code. This is a subsection of an assembly program that computes prime numbers using the Sieve of Eratosthenes algorithm.

For some reason the sw (store word) instruction has a cache hit rate of 25%, whereas the rest of the program averages about 50% in its current state. I've tried rearranging some things, but I can't figure out what is causing this bottleneck. What needs to be done to improve this cache hit rate?

inner:  add $t2, $s2, 0 # save the bottom of stack address to $t2
        mul $t3, $t1, 4 # calculate the number of bytes to jump over
        sub $t2, $t2, $t3   # subtract them from bottom of stack address
        add $t2, $t2, 8 # add 2 words - we started counting at 2!

        sw  $s0, ($t2)  # store 1's -> it's not a prime number!

        add $t1, $t1, $t0   # do this for every multiple of $t0
        bgt $t1, $t9, outer # every multiple done? go back to outer loop

        j   inner       # some multiples left? go back to inner loop

I was able to fix this issue by modifying the program to store bytes instead of words. Each cache block now holds four times as many sieve entries, so successive stores are more likely to land in a block that is already cached, which increased the hit rate.

inner:  add $t2, $s2, 0 # save the bottom of stack address to $t2
        addi $t3, $t1, 1 # $t3 = $t1 + 1: offset in bytes (one byte per entry)
        sub $t2, $t2, $t3   # subtract it from bottom of stack address
        add $t2, $t2, 2 # add 2 bytes - we started counting at 2!

        sb  $s0, ($t2)  # store 1's -> it's not a prime number!

        add $t1, $t1, $t0   # do this for every multiple of $t0
        bgt $t1, $t9, outer # every multiple done? go back to outer loop

        j   inner       # some multiples left? go back to inner loop
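
To see why the element size matters, here is a rough Python sketch that replays only the inner loop's stores through a tiny direct-mapped cache model. Everything in it is an assumption made for illustration: the cache geometry (8 blocks of 16 bytes), the base address, the sieve limit, the choice to start marking at 2*p, and the helper names store_addresses and hit_rate. It uses the word version's address formula (base - size * (m - 2)) for both element sizes, and it ignores every other load and store in the program, so the percentages will not match what MARS reports.

```python
def store_addresses(limit, elem_size, base=0x7FFFEFFC):
    """Yield the address of each store the inner loop would perform.

    Assumed layout: the flag for number m lives at base - (m - 2) * elem_size,
    i.e. the array grows downward from a hypothetical stack address.
    """
    flags = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if flags[p]:
            continue  # p is composite, its multiples are already marked
        for m in range(2 * p, limit + 1, p):
            flags[m] = True
            yield base - (m - 2) * elem_size


def hit_rate(addresses, num_blocks=8, block_bytes=16):
    """Fraction of accesses that hit in a direct-mapped cache (assumed geometry)."""
    cached = [None] * num_blocks  # block number currently held by each cache line
    hits = total = 0
    for addr in addresses:
        block = addr // block_bytes   # which memory block the address falls in
        index = block % num_blocks    # the single cache line that block may use
        if cached[index] == block:
            hits += 1
        else:
            cached[index] = block     # miss: replace whatever was in the line
        total += 1
    return hits / total


if __name__ == "__main__":
    for name, size in (("sw (4-byte entries)", 4), ("sb (1-byte entries)", 1)):
        rate = hit_rate(store_addresses(100, size))
        print(f"{name}: {rate:.0%} of inner-loop stores hit")
```

With word-sized entries, marking the multiples of p moves the address by 4*p bytes per store, so even p = 2 puts only two stores in each 16-byte block, and the whole array is larger than the modelled cache. With byte-sized entries the stride shrinks to p bytes and, for this limit, the entire flag array fits in the cache at once.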
