Return from a procedure in ARM assembly

When creating a function in ARM assembly, I usually push r4-r5 and the LR register onto the stack at the beginning, and after the function has finished I pop r4-r5 and PC:

.global myfunc
.type   myfunc, %function

myfunc:
push {r4-r5,lr}
... do stuff...
pop {r4-r5,pc}

However, I have read that using stmfd and ldmfd one might get better performance:

myfunc:
stmfd sp!,{r4-r11,lr}
...do stuff...
ldmfd sp!,{r4-r11,pc}

What exactly is sp? I presume it's not really worth saving all the registers r4-r11 if I'm not actually using them inside myfunc, right? So the push-pop variant is better in that case?

PUSH {...} is the Thumb equivalent of the ARM instruction STMDB SP!,{...}

POP {...} is the Thumb equivalent of the ARM instruction LDMIA SP!,{...}

STM means STore Multiple.
DB means Decrement Before, i.e. decrement the destination address before each store in this case.
IA means Increment After, i.e. increment the source address after each load in this case.
! means write back the final address to the source/destination address register. For example, if SP was 0x100 and you did STMDB SP!,{R0-R2}, you'd have 0xF4 in SP afterwards.
SP is an alias for R13, and is used as the stack pointer on ARM processors.
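The address arithmetic described above can be checked with a small model. This is only a sketch in Python, not ARM code: the helper names stmdb and ldmia are made up for illustration, registers are plain integers, and each register is assumed to take 4 bytes.

```python
def stmdb(base, regs, writeback=True):
    """Model STMDB base{!},{regs}: return (stores, new_base).

    Decrement Before: the registers land in the words just below `base`,
    lowest-numbered register at the lowest address.
    """
    low = base - 4 * len(regs)
    stores = {low + 4 * i: r for i, r in enumerate(sorted(regs))}
    return stores, (low if writeback else base)


def ldmia(base, regs, writeback=True):
    """Model LDMIA base{!},{regs}: return (loads, new_base).

    Increment After: the registers are read from `base` upwards.
    """
    loads = {base + 4 * i: r for i, r in enumerate(sorted(regs))}
    return loads, (base + 4 * len(regs) if writeback else base)


# STMDB SP!,{R0-R2} with SP = 0x100
stores, sp = stmdb(0x100, [0, 1, 2])
print(hex(sp))  # 0xf4

# The matching LDMIA SP!,{R0-R2} reads the same words and restores SP
loads, sp = ldmia(sp, [0, 1, 2])
print(hex(sp))  # 0x100
```

The two helpers touch exactly the same addresses, which is why a push (STMDB) is undone by a pop (LDMIA) of the same register list.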

push and pop are pseudo-instructions to the assembler; they are not real instructions. You get either a store with the base register updated or an stm.

push {r11}
stmdb r13!,{r11}

push {r10-r12}
stmdb r13!,{r10-r12}

I prefer stmdb to stmfd; they are just different syntax for the same instruction. (stmdb and ldmia make sense to me: decrement before and increment after.)

Assemble, then disassemble:

   0:   e52db004    push    {fp}        ; (str fp, [sp, #-4]!)
   4:   e92d0800    stmfd   sp!, {fp}
   8:   e92d1c00    push    {sl, fp, ip}
   c:   e92d1c00    push    {sl, fp, ip}

If you look up the stm encoding, or even just look at the bits and think about it, the upper bits of the instruction (0xe92d) are stmdb/fd with sp as the base register and writeback, and the lower 16 bits are flags indicating which registers are to be saved. Notice that at address 4, which is a push of r11 alone, only bit 11 is set (0x0800); at 8 and c that bit is set for r11, along with the one below it for r10 and the one above it for r12 (0x1c00).
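To make the bit-level reading concrete, here is a rough Python sketch that pulls apart the two stm words from the disassembly. The function name and the returned dictionary keys are my own choices; the field positions are those of the classic 32-bit ARM block-transfer encoding.

```python
def decode_block_transfer(word):
    """Split a 32-bit ARM LDM/STM encoding into its main fields."""
    if (word >> 25) & 0b111 != 0b100:
        raise ValueError("not a block data transfer (LDM/STM) encoding")
    return {
        "cond": (word >> 28) & 0xF,  # 0xE = always
        "P": (word >> 24) & 1,       # 1 = adjust the address before each transfer
        "U": (word >> 23) & 1,       # 1 = increment, 0 = decrement
        "W": (word >> 21) & 1,       # 1 = write back to the base (the "!")
        "L": (word >> 20) & 1,       # 1 = load (ldm), 0 = store (stm)
        "Rn": (word >> 16) & 0xF,    # base register number
        "regs": [r for r in range(16) if word & (1 << r)],
    }


for word in (0xE92D0800, 0xE92D1C00):
    f = decode_block_transfer(word)
    print(hex(word), "base r%d" % f["Rn"], "regs", f["regs"])
# 0xe92d0800 base r13 regs [11]
# 0xe92d1c00 base r13 regs [10, 11, 12]
```

P=1, U=0, W=1, L=0 with Rn=13 is exactly "store, decrement before, write back, base sp", i.e. stmdb sp!. The word at address 0 (0xe52db004) is a single-register str, so this helper rejects it.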

push and pop are easier to read than trying to remember to use sp, to put the ! after the register, and to remember the ia/db/fd etc. suffixes and all that.
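For reference, the stack-style suffixes are just aliases for the addressing-mode suffixes, and the mapping differs between stores and loads. A small lookup table in Python (the names STACK_ALIASES and canonical are made up for this sketch):

```python
# stack-style suffix -> (store form, load form)
STACK_ALIASES = {
    "fd": ("stmdb", "ldmia"),  # full descending, the usual ARM stack
    "fa": ("stmib", "ldmda"),  # full ascending
    "ed": ("stmda", "ldmib"),  # empty descending
    "ea": ("stmia", "ldmdb"),  # empty ascending
}


def canonical(mnemonic):
    """Map e.g. 'stmfd' to 'stmdb'; leave other mnemonics unchanged."""
    m = mnemonic.lower()
    op, suffix = m[:3], m[3:]
    if op in ("stm", "ldm") and suffix in STACK_ALIASES:
        store, load = STACK_ALIASES[suffix]
        return store if op == "stm" else load
    return m


print(canonical("stmfd"), canonical("ldmfd"))  # stmdb ldmia
```

So stmfd sp!/ldmfd sp! and stmdb sp!/ldmia sp! are the same pair of instructions, which is also what push and pop turn into.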

I believe that Thumb has an actual push/pop.

The single-register variant for ARM turned into a single store. It doesn't matter whether you use an stm with one register or an str; the operations are functionally equivalent.

So long as r13 is updated by the operation (the !) and you use db or fd for the stm, you can use either the pseudo-instruction or the real instructions.

If you are going to store/restore more than one register, then definitely list them in a single instruction; don't make a list of several pushes or pops.

no:
push {r10}
push {r11}
push {r12}
yes:
push {r10-r12}

Unless you are on Thumb, where you might not have a choice: you can only push r0-r7 plus r14 and pop r0-r7 plus r15, so to save higher registers you have to copy them down into lower registers and then use push. And you have to use push; the stm won't let you use r13 as the base. (Thumb2, depending on what extensions are available on your architecture, gives you more of an ARM-like experience.)

Re-reading your question:

sp is r13, the stack pointer. The pseudo-instruction chooses the right instructions, so you don't need to worry about stm vs. str.

When you store more than one register you "can" get an optimization on modern ARM systems, but it is not guaranteed. If your AMBA/AXI bus is 64 bits wide, it is more than 2 times faster to write 64 bits at a time rather than 32 bits at a time, because on a 64-bit memory system it takes a read-modify-write to do a 32-bit write, but a 64-bit write does not (let's ignore the cache behavior). If the stm is on an aligned address (when using the stack it would take too much code to figure that out, so don't worry about it), then a push of 2 registers would be noticeably faster than two separate pushes (unless the core optimizes those into one bus cycle).

If you push, say, 4 registers, one of three things happens. If the address is unaligned, you get three transfers: a 32-bit transfer on the unaligned address (let's say 0x1004), then a 64-bit transfer on the aligned address after that (0x1008), then a 32-bit transfer of the last register (0x1010). If that four-register push had been on an aligned address, then one of two things happens: either two separate 64-bit transfers, two registers to 0x2010, let's say, and two to 0x2018, or a length-2 transfer (two 64-bit items in a single transfer) at the aligned base address, say 0x2010. You won't get the worst case, though, which is four individual 32-bit transfers, so it is worth using the stm/push.
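The aligned and unaligned cases above can be reproduced with a toy model. This Python sketch assumes a 64-bit (8-byte) bus where a single transfer cannot cross an 8-byte boundary; it ignores caches, write buffers and burst merging, so treat it as an illustration of the address arithmetic only. The function name bus_transfers is hypothetical.

```python
def bus_transfers(addr, nregs, bus_bytes=8):
    """Split nregs consecutive 32-bit stores starting at addr into
    transfers that never cross a bus-width boundary.

    Returns a list of (address, size_in_bytes).
    """
    end = addr + 4 * nregs
    transfers = []
    while addr < end:
        boundary = (addr // bus_bytes + 1) * bus_bytes  # next aligned address
        size = min(boundary, end) - addr
        transfers.append((addr, size))
        addr += size
    return transfers


def show(transfers):
    return [(hex(a), 8 * s) for a, s in transfers]  # sizes shown in bits


print(show(bus_transfers(0x1004, 4)))
# [('0x1004', 32), ('0x1008', 64), ('0x1010', 32)]
print(show(bus_transfers(0x2010, 4)))
# [('0x2010', 64), ('0x2018', 64)]
```

Four separate single-register pushes would each be their own 32-bit transfer in this model, which is the worst case that a single stm/push avoids.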

You don't need to push the registers onto the stack if you are not going to use them. Having said that, you will have to see whether skipping them adds any real performance benefit. I think it is simpler to push everything: if you or someone else modifies the code at a later point, it won't accidentally corrupt the registers and the stack.

By the way, you can also do this; that is, save only r4-r5 using stmfd.

myfunc:
stmfd sp!,{r4-r5,lr}
...do stuff...
ldmfd sp!,{r4-r5,pc}

OR

myfunc:
stmfd r13!,{r4-r5,r14}
...do stuff...
ldmfd r13!,{r4-r5,pc}

You can make out that sp is an alias for r13 and lr is an alias for r14, where sp stands for stack pointer and lr for link register.

SP is the stack pointer register; it indicates the top of the current stack. I believe you only need to use stmfd if you're saving higher registers. If you only need to save a couple of lower registers, just push and pop.
