Raspberry Pi 1 B vs. Raspberry Pi 2 B ASM speed difference

Question

I have the following code:

for (short l = j; l < j + input->w_small; l = l + 4){
  add_b = k * input->w_big + l;
  add_s = (k - i) * input->w_small + l - j;

  __asm__ __volatile__(
      "ldr %%r1, [%1];"
      "ldr %%r2, [%2];"
      "usada8 %0, %%r1, %%r2, %0;"
      :"+r" (sad)
      : "r" (input->pic_big + add_b), "r" (input->pic_small + add_s)
      : "r1", "r2"
      );
}

This is part of an image processing algorithm. The application runs in 29.24 seconds on the RPi 1 B and in 7.65 seconds on the RPi 2 B, a 3.82x speed-up. The question is: why? I understand that there is an architectural change between the models, but I didn't find any reference saying that USADA8 should be significantly faster on ARMv7. Any ideas?

PS: Don't get me wrong, I am perfectly happy with the results, I'm just curious :)

Answer 1

There may be many reasons, but the main ones are probably (according to this):

The core frequency is not the same (900 MHz for the model 2B and 700 MHz for the model 1B).
The L1 cache of the 2B is twice the size of the L1 cache in the model 1B (32 kB vs. 16 kB). I suspect that the L2 caches, and the cache hierarchy in general, are also different.
You might have a different configuration on each board (the frequencies of various components can be tweaked).
