Why is SP (apparently) stored on exception entry on Cortex-M3?

I am using a TI LM3S811 (an older Cortex-M3) with the SysTick interrupt set to trigger at 10 Hz. This is the body of the ISR:

void SysTick_Handler(void)
{
    __asm__ volatile("sub r4, r4, #32\r\n");
}

This produces the following assembly with -O0 and -fomit-frame-pointer using gcc-4.9.3. The STKALIGN bit is 0, so stacks are 4-byte aligned.

00000138 <SysTick_Handler>:
 138:   4668        mov r0, sp
 13a:   f020 0107   bic.w   r1, r0, #7
 13e:   468d        mov sp, r1
 140:   b401        push    {r0}
 142:   f1ad 0420   sub.w   r4, r4, #32
 146:   f85d 0b04   ldr.w   r0, [sp], #4
 14a:   4685        mov sp, r0
 14c:   4770        bx  lr
 14e:   bf00        nop

I don't understand what's going on with r0 in the listing above. Specifically:

1) It seems like we're clearing the lower 3 bits of SP and storing it on the stack. Is that to maintain 8-byte alignment? Or is it something else?

2) The exception exit procedure is equally confusing. From my limited understanding of ARM assembly, it does something like this: SP = SP + 4; R0 = SP;

This is followed by storing it back to SP, which seems to undo the manipulations up to this stage.

3) Why is there a nop instruction after the unconditional branch (at 0x14E)?

Since you're using -O0, you should expect lots of redundant and useless code. A compiler generally works by first generating code that covers everything that might be needed anywhere in the program, and then relying on the optimizer to remove whatever turns out to be unneeded.

  1. Yes, this is doing 8-byte alignment. It's also allocating a stack frame to hold local variables even though you have none.
  2. The exit is the reverse, deallocating the stack frame.
  3. The nop at the end is padding to maintain 4-byte alignment in the code, as you might want to link with non-Thumb code at some point.

If you enable optimization, it will eliminate the stack frame (as it's unneeded) and the code will become much simpler.

The ARM Procedure Call Standard and the C ABI expect the stack to be 8-byte (64-bit) aligned. As an interrupt might occur right after a single word has been pushed or popped, the stack is not guaranteed to be correctly aligned on interrupt entry.

The STKALIGN bit, if set (the default), makes the hardware align the stack automatically by conditionally pushing an extra (dummy) word onto the stack.
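The hardware behavior can be modeled on a host machine. The following is only a sketch, assuming the ARMv7-M exception entry rules (a 0x20-byte frame, with bit 9 of the stacked xPSR recording whether a padding word was inserted). The type `cpu_model` and the function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_SIZE     0x20u      /* eight words: r0-r3, r12, lr, pc, xPSR */
#define XPSR_ALIGN_BIT (1u << 9)  /* set in the stacked xPSR if padding was added */

/* Hypothetical model holding only the state that matters here. */
typedef struct {
    uint32_t sp;            /* current stack pointer (byte address) */
    uint32_t stacked_xpsr;  /* stands in for the xPSR word saved in the frame */
} cpu_model;

/* Exception entry: reserve the frame and, if STKALIGN is set, force
 * the new SP down to an 8-byte boundary by clearing bit 2. */
void exception_entry(cpu_model *cpu, int stkalign)
{
    assert((cpu->sp & 3u) == 0);  /* SP is always word aligned */
    uint32_t pad  = (stkalign && (cpu->sp & 4u)) ? 1u : 0u;
    uint32_t mask = stkalign ? ~4u : ~0u;
    cpu->sp = (cpu->sp - FRAME_SIZE) & mask;
    cpu->stacked_xpsr = pad ? XPSR_ALIGN_BIT : 0u;
}

/* Exception return: release the frame and put bit 2 back if the
 * stacked xPSR says a padding word was inserted on entry. */
void exception_return(cpu_model *cpu, int stkalign)
{
    uint32_t pad = (stkalign && (cpu->stacked_xpsr & XPSR_ALIGN_BIT)) ? 4u : 0u;
    cpu->sp = (cpu->sp + FRAME_SIZE) | pad;
}
```

Starting from an SP of 0x20000FFC with STKALIGN set, entry leaves SP at 0x20000FD8 (8-byte aligned) and return restores 0x20000FFC. With STKALIGN clear, entry simply leaves SP at 0x20000FDC.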

The interrupt attribute on a function, on the other hand, tells gcc that the stack might be misaligned, so it adds this pre-/postamble, which enforces the alignment.
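To see why the postamble can undo the preamble exactly, here is a small host-side sketch that mirrors the instructions from the listing in the question. The `stack_model` type and the function names are made up for illustration, and the stack is just a small array rather than real memory.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

/* Hypothetical model: word i of mem lives at byte address 4*i. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t sp;  /* byte address, full-descending stack */
} stack_model;

/* mov r0, sp ; bic r1, r0, #7 ; mov sp, r1 ; push {r0} */
void isr_prologue(stack_model *s)
{
    assert((s->sp & 3u) == 0 && s->sp >= 8u && s->sp <= 4u * STACK_WORDS);
    uint32_t old_sp = s->sp;      /* r0 = original SP */
    s->sp = old_sp & ~7u;         /* clear the low three bits */
    s->sp -= 4u;                  /* push {r0}: pre-decrement ... */
    s->mem[s->sp / 4u] = old_sp;  /* ... then store the original SP */
}

/* ldr r0, [sp], #4 ; mov sp, r0 */
void isr_epilogue(stack_model *s)
{
    uint32_t saved = s->mem[s->sp / 4u];  /* r0 = [sp] */
    s->sp += 4u;                          /* post-indexed write-back */
    s->sp = saved;                        /* mov sp, r0 restores the original */
}
```

Because the original SP is saved on the stack itself, the epilogue does not need to know whether the mask actually changed anything. Starting from SP = 44, the prologue leaves SP at 36 with the value 44 stored there, and the epilogue returns SP to 44.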

So, both actually do the same; one in hardware, one in software. If you can live with a word-aligned stack only, you should remove the interrupt attribute from the function declarations and clear the STKALIGN bit.

Make sure such a "misaligned" stack is not a problem (I would not expect any, as this is a pure 32-bit CPU). On the other hand, you should leave it as-is unless you really need to save that extra conditional(!) clock cycle and word (very unlikely).

Warning: According to the ARM Architecture Reference Manual, setting STKALIGN == 0 is deprecated. Briefly: do not set this bit to 0!
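For completeness, a minimal sketch of checking and setting the bit. It assumes the ARMv7-M layout, where STKALIGN is bit 9 of the Configuration and Control Register at 0xE000ED14. The helper names are hypothetical; they take a pointer so the logic can also be exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

#define CCR_ADDR     0xE000ED14u  /* Configuration and Control Register (ARMv7-M) */
#define CCR_STKALIGN (1u << 9)    /* STKALIGN bit */

/* Returns nonzero if 8-byte stack alignment on exception entry is enabled. */
int stkalign_enabled(const volatile uint32_t *ccr)
{
    assert(ccr != 0);
    return (*ccr & CCR_STKALIGN) != 0;
}

/* Sets STKALIGN without disturbing the other CCR bits. */
void stkalign_enable(volatile uint32_t *ccr)
{
    assert(ccr != 0);
    *ccr |= CCR_STKALIGN;
}
```

On the target this would be called as `stkalign_enable((volatile uint32_t *)CCR_ADDR);` early in startup code, before any interrupt is enabled.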
