Is the Link Register (LR) affected by inline or naked functions?

Question

I'm using an ARM Cortex-M4 processor. As far as I understand, the LR (link register) stores the return address of the currently executing function. However, do inline and/or naked functions affect it?

I'm working on implementing simple multitasking. I'd like to write some code that saves the execution context (pushing R0-R12 and LR to the stack) so that it can be restored later. After the context save, I have an SVC so the kernel can schedule another task. When it decides to schedule the current task again, it would restore the stack and execute BX LR. I'm asking this question because I'd like BX LR to jump to the correct place.

Let's say I use arm-none-eabi-g++ and I'm not concerned with portability.

For example, if I have the following code with the always_inline attribute, since the compiler will inline it, then there is not gonna be a function call in the resulting machine code, so the LR is unaffected, right?

__attribute__((always_inline))
inline void Task::saveContext() {
    asm volatile("PUSH {R0, R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R12, LR}");
}

Then there is also the naked attribute, whose documentation says that the function will not have prologue/epilogue sequences generated by the compiler. What exactly does that mean? Does a naked function still result in a function call, and does it affect the LR?

__attribute__((naked))
void saveContext() {
    asm volatile("PUSH {R0, R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R12, LR}");
}

Also, out of curiosity, what happens if a function is marked with both always_inline and naked? Does that make a difference?

Which is the correct way to ensure that a function call does not affect the LR?

Answer 1

As far as I understand, the LR (link register) stores the return address of the currently executing function.

Nope, lr simply receives the address of the following instruction upon execution of a bl or blx instruction. In the M-class architecture, it also receives a special magic value (EXC_RETURN) upon exception entry, which will trigger an exception return when used like a return address, making exception handlers look exactly the same as regular functions.

Once the function has been entered, the compiler is free to save that value elsewhere and use r14 as just another general-purpose register. Indeed, it needs to save the value somewhere if it wants to make any nested calls. With most compilers any non-leaf function will push lr to the stack as part of the prologue (and often take advantage of being able to pop it straight back into pc in the epilogue to return).

Which is the correct way to ensure that a function call does not affect the LR?

A function call by definition affects lr - otherwise it would be a goto, not a call (tail-calls notwithstanding, of course).

Answer 2

re: update. Leaving my old answer below, since it answers the original question before the edit.

__attribute__((naked)) basically exists so you can write the whole function in asm, inside asm statements instead of in a separate .S file. The compiler doesn't even emit a return instruction, you have to do that yourself. It doesn't make sense to use this for inline functions (like I already answered below).

Calling a naked function will generate the usual call sequence, with a bl my_naked_function, which of course sets LR to point to the instruction after the bl. A naked function is essentially a never-inline function that you write in asm. "Prologue" and "epilogue" are the instructions that save and restore callee-saved registers, and the return instruction itself (bx lr).

Try it and see. It's easy to look at gcc's asm output. I changed your function names to help explain what's going on, and fixed the syntax (The GNU C __attribute__ extension requires doubled parens).

extern void extfunc(void);

__attribute__((always_inline))
inline void break_the_stack() {   asm volatile("PUSH LR");   }

__attribute__((naked))
void myFunc() {
    asm volatile("PUSH {r3, LR}\n\t"  // keep the stack aligned for our callee by pushing a dummy register along with LR
                 "bl extfunc\n\t"
                 "pop {r3, PC}"
                );
}


int foo_simple(void) {
  extfunc();
  return 0;
}

int foo_using_inline(void) {
  break_the_stack();
  extfunc();
  return 0;
}

asm output with gcc 4.8.2 -O2 for ARM (default is a thumb target, I think).

myFunc():            # I followed the compiler's foo_simple example for this
        PUSH {r3, LR}
        bl extfunc
        pop {r3, PC}
foo_simple():
        push    {r3, lr}
        bl      extfunc()
        movs    r0, #0
        pop     {r3, pc}
foo_using_inline():
        push    {r3, lr}
        PUSH LR
        bl      extfunc()
        movs    r0, #0
        pop     {r3, pc}

The extra push LR means we're popping the wrong data into PC: here pop {r3, pc} loads r3 with the extra copy of LR and PC with whatever was in r3 on entry. Even if PC happened to get a usable value, we'd be returning with a modified stack pointer, so the caller will break. Don't mess with LR or the stack in an inline function, unless you're trying to do some kind of binary instrumentation thing.

re: comments: if you just want to set a C variable = LR:

As @Notlikethat points out, LR might not hold the return address. So you might want __builtin_return_address(0) to get the return address of the current function. However, if you're just trying to save register state, then you should save/restore whatever the function has in LR if you hope to correctly resume execution at this point:

#define get_lr(lr_val)  asm ("mov %0, lr" : "=r" (lr_val))

This might need to be volatile to stop it from being hoisted up the call tree during whole-program optimization.

This leads to an extra mov instruction when perhaps the ideal sequence would be to store lr, rather than copy to another reg first. Since ARM uses different instructions for reg-reg move vs. store to memory, you can't just use a rm constraint for the output operand to give the compiler that option.

You could wrap this inside an inline function. A GNU C statement-expression in a macro would also work, but an inline function should be fine:

static inline __attribute__((always_inline)) void* current_lr(void) {  // This should work correctly when inlined, or just use the macro
  void* lr;
  get_lr(lr);
  return lr;
}
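
For the other case mentioned above, where you want the return address of the current function rather than whatever happens to be in lr right now, here is a minimal sketch using the GCC builtin. The helper name caller_return_address is made up for illustration, and the noinline attribute is my assumption about how you'd want to use it: without a real call there is no return address of its own to report.

```c
#include <stddef.h>

/* Hypothetical helper name, not part of the original answer.
   noinline forces a real call (a bl on ARM), so this function has
   its own return address to report. */
__attribute__((noinline))
void *caller_return_address(void) {
    /* Argument 0 asks for the return address of this function itself.
       The compiler knows where it saved that value, so this works even
       if lr has since been reused as a scratch register. */
    return __builtin_return_address(0);
}
```

Calling caller_return_address() from a task gives the address execution will continue at after the call, which is probably what you'd want to log or inspect. It is not a substitute for saving lr as part of a full register context.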

For reference: What are SP (stack) and LR in ARM?

A naked always_inline function is not useful.

The docs say a naked function can only contain asm statements, and only "Basic" asm (without operands, so you have to get args from the right place for the ABI yourself). Inlining that makes zero sense, because you won't know where the compiler put your args.

If you want to inline some asm, don't use a naked function. Instead, use an inline function that uses correct constraints for input/output parameters.
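
As a rough sketch of what that looks like (the function add_with_asm is a made-up example, not something from the question), the constraints tell the compiler which C values go in and out, and it picks the registers. The asm branch assumes a 32-bit ARM or Thumb-2 target such as the Cortex-M4; the plain C branch is only there so the snippet also builds on other hosts.

```c
#include <stdint.h>

/* Hypothetical example: add two values with one asm instruction. */
static inline uint32_t add_with_asm(uint32_t a, uint32_t b) {
    uint32_t sum;
#if defined(__arm__)
    /* "=r" marks sum as a register output, "r" marks a and b as register
       inputs. The compiler substitutes its chosen registers for %0..%2,
       so this stays correct wherever the function gets inlined. */
    __asm__ ("add %0, %1, %2" : "=r" (sum) : "r" (a), "r" (b));
#else
    sum = a + b;  /* fallback for non-ARM hosts, same result */
#endif
    return sum;
}
```

Because nothing here touches sp or lr behind the compiler's back, it is safe to inline anywhere, unlike the PUSH example above.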

The x86 wiki has some good inline asm links, and they're not all specific to x86. For example, see the collection of GNU inline asm links at the end of this answer for examples of how to make good use of the syntax to let the compiler make as efficient code as possible around your asm fragment.

2 answers

solution1
2 2016-03-17 17:08:08

solution2
1 ACCEPTED 2016-03-17 16:57:54