
Convert C code to ARM Cortex-M3 assembler code

I have the following C function:

int main_compare (int nbytes, char *pmem1, char *pmem2){
    for(nbytes--; nbytes>=0; nbytes--) {    
        if(*(pmem1+nbytes) - *(pmem2+nbytes) != 0) {
            return 0;
        }
    }
    return 1;
}

I want to convert it into ARM Cortex-M3 assembler code. I'm not very good at this, and I don't have a suitable compiler to test whether I've got it right. Here is what I have so far:

byte_cmp_loop PROC
; assuming: r0 = nbytes, r1=pmem1, r2 = pmem2

    SUB R0, R0, #1    ; nBytes - 1 as maximal value for loop counter

_for_loop: 
    ADD R3, R1, R0    ;
    ADD R4, R2, R0    ; calculate pmem + n
    LDRB R3, [R3]     ;
    LDRB R4, [R4]     ; look at this address

    CMP R3, R4        ; if cmp = 0, then jump over return

    BE _next          ; if statement by "branch"-cmd
        MOV R0, #0    ; return value is zero
        BX LR         ; always return 0 here
_next:

    sub R0, R0, #1    ; loop counting
    BLPL _for_loop    ; pl = if positive or zero

    MOV R0, #1        ;
    BX LR             ; always return 1 here

ENDP

But I'm really not sure whether this is right, and I have no idea how to check it.

I see just three fairly simple problems there:

BE _next          ; if statement by "branch"-cmd
...
sub R0, R0, #1    ; loop counting
BLPL _for_loop    ; pl = if positive or zero
  • BEQ, not BE: condition codes are always two letters.
  • SUB alone won't update the flags; you need the S suffix to say so, i.e. SUBS.
  • BLPL would branch and link, thus overwriting your return address; you want BPL. Actually, BLPL wouldn't assemble here anyway, since in Thumb a conditional BL would need an IT to set it up (unless of course your assembler is clever enough to insert one automatically).

Edit: there's also of course a more general issue with the use of R4 in both the original code and my examples below: if you're interfacing with C code, its original value must be preserved across the function call and restored afterwards (R0-R3 are designated argument/scratch registers and can be freely modified). If you're in pure assembly, however, you don't necessarily need to follow a standard calling convention, so you can be more flexible.


Now, that's a very literal representation of the C code, and it doesn't make best use of the instruction set, in particular the indexed addressing modes. One of the attractions of assembly programming is having complete control of the instructions, so how can we make it worth our while?

First, let's make the C code look a little more like the assembly we want:

int main_compare (int nbytes, char *pmem1, char *pmem2){
    while(nbytes-- > 0) {    
        if(*pmem1++ != *pmem2++) {
            return 0;
        }
    }
    return 1;
}

Now that that shows our intent more clearly, let's play compiler:

byte_cmp_loop PROC
; assuming: r0 = nbytes, r1=pmem1, r2 = pmem2

_loop:
    SUBS R0, R0, #1   ; Decrement nbytes and set flags based on the result
    BMI  _finished    ; If nbytes is now negative, it was 0, so we're done

    LDRB R3, [R1], #1 ; Load from the address in R1, then add 1 to R1
    LDRB R4, [R2], #1 ; ditto for R2
    CMP R3, R4        ; If they match...
    BEQ _loop         ; then continue round the loop

    MOV R0, #0        ; else give up and return zero
    BX LR

_finished:
    MOV R0, #1        ; Success!
    BX LR
ENDP

And that's nearly 25% fewer instructions! Now if we pull in another instruction set feature, conditional execution, and relax the requirements slightly without breaking C semantics, it gets smaller still:

byte_cmp_loop PROC
; assuming: r0 = nbytes, r1=pmem1, r2 = pmem2

_loop:
    SUBS R0, R0, #1 ; In C zero is false and any nonzero value is true, so
                    ; when R0 becomes -1 to trigger this branch, we can just
                    ; return that to indicate success
    IT MI           ; Make the following instruction conditional on 'minus'
    BXMI LR

    LDRB R3, [R1], #1
    LDRB R4, [R2], #1
    CMP R3, R4
    BEQ _loop

    MOVS R0, #0     ; Using MOVS rather than MOV to get a 16-bit encoding,
                    ; since updating the flags won't matter at this point
    BX LR
ENDP

Assembling to a meagre 22 bytes, that's nearly 40% less code than we started with :D
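To see why returning the decremented counter does not break C semantics, the final listing can be modelled in C next to the original function. This is only a sketch: compare_relaxed is a hypothetical name introduced here, and the comments map each statement to the instruction it is meant to stand for.

```c
/* Reference: the original function from the question. */
int main_compare(int nbytes, char *pmem1, char *pmem2)
{
    for (nbytes--; nbytes >= 0; nbytes--) {
        if (*(pmem1 + nbytes) - *(pmem2 + nbytes) != 0) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical C model of the final assembly listing. */
int compare_relaxed(int nbytes, char *pmem1, char *pmem2)
{
    for (;;) {
        nbytes--;                     /* SUBS R0, R0, #1 */
        if (nbytes < 0) {             /* IT MI */
            return nbytes;            /* BXMI LR: R0 is nonzero, so "true" */
        }
        if (*pmem1++ != *pmem2++) {   /* LDRB, LDRB, CMP */
            return 0;                 /* MOVS R0, #0 ; BX LR */
        }
        /* BEQ _loop: bytes matched, go round again */
    }
}
```

A caller that tests the result with if (...) or ! cannot tell the two apart; only a caller that compares the result with == 1 would notice the difference.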

Well, here is some compiler-generated code:

arm-none-eabi-gcc -O2 -mthumb -c test.c -o test.o
arm-none-eabi-objdump -D test.o

00000000 <main_compare>:
   0:   b510        push    {r4, lr}
   2:   3801        subs    r0, #1
   4:   d502        bpl.n   c <main_compare+0xc>
   6:   e007        b.n 18 <main_compare+0x18>
   8:   3801        subs    r0, #1
   a:   d305        bcc.n   18 <main_compare+0x18>
   c:   5c0c        ldrb    r4, [r1, r0]
   e:   5c13        ldrb    r3, [r2, r0]
  10:   429c        cmp r4, r3
  12:   d0f9        beq.n   8 <main_compare+0x8>
  14:   2000        movs    r0, #0
  16:   e000        b.n 1a <main_compare+0x1a>
  18:   2001        movs    r0, #1
  1a:   bc10        pop {r4}
  1c:   bc02        pop {r1}
  1e:   4708        bx  r1

arm-none-eabi-gcc -O2 -mthumb -mcpu=cortex-m3 -c test.c -o test.o
arm-none-eabi-objdump -D test.o

00000000 <main_compare>:
   0:   3801        subs    r0, #1
   2:   b410        push    {r4}
   4:   d503        bpl.n   e <main_compare+0xe>
   6:   e00a        b.n 1e <main_compare+0x1e>
   8:   f110 30ff   adds.w  r0, r0, #4294967295 ; 0xffffffff
   c:   d307        bcc.n   1e <main_compare+0x1e>
   e:   5c0c        ldrb    r4, [r1, r0]
  10:   5c13        ldrb    r3, [r2, r0]
  12:   429c        cmp r4, r3
  14:   d0f8        beq.n   8 <main_compare+0x8>
  16:   2000        movs    r0, #0
  18:   f85d 4b04   ldr.w   r4, [sp], #4
  1c:   4770        bx  lr
  1e:   2001        movs    r0, #1
  20:   f85d 4b04   ldr.w   r4, [sp], #4
  24:   4770        bx  lr
  26:   bf00        nop

It is funny that the Thumb-2 extensions don't really seem to make this better, and possibly make it worse.

If you don't have a compiler, does that mean you don't have an assembler and linker either? Without an assembler and linker it is going to be a lot of work hand-compiling and hand-assembling to machine code. And then how are you going to load this into a processor, etc.?

If you don't have a cross compiler for ARM, do you have a compiler at all? You need to tell us more about what you do and don't have. If you have a web browser that you used to find Stack Overflow and post questions, you can probably download the CodeSourcery tools or the https://launchpad.net/gcc-arm-embedded tools and have a compiler, assembler and linker (and not have to hand-convert from C to asm).
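With any C compiler to hand, one way to check a routine like this is to run it against a known-good yardstick over a few inputs. The sketch below is an assumption about how such a check might look, not something from the question: compare_fn and agrees_with_memcmp are hypothetical names, and the standard memcmp supplies the expected answer.

```c
#include <string.h>

/* Reference: the original function from the question. */
int main_compare(int nbytes, char *pmem1, char *pmem2)
{
    for (nbytes--; nbytes >= 0; nbytes--) {
        if (*(pmem1 + nbytes) - *(pmem2 + nbytes) != 0) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical: any routine with the same signature as main_compare. */
typedef int (*compare_fn)(int, char *, char *);

/* Hypothetical helper: returns 1 when fn gives the same true/false
 * answer as memcmp for these inputs, 0 otherwise. A count of zero or
 * less means "nothing to compare", which the original treats as equal. */
int agrees_with_memcmp(compare_fn fn, int nbytes, char *a, char *b)
{
    int expected = (nbytes <= 0) || (memcmp(a, b, (size_t)nbytes) == 0);
    int actual = fn(nbytes, a, b) != 0;
    return expected == actual;
}
```

On the target, the assembly routine could be declared as extern int byte_cmp_loop(int, char *, char *); and passed in place of main_compare, provided it follows the C calling convention (including preserving R4).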

As far as your code goes, the subtract of 1 is correct for the nbytes--, but you failed to compare the resulting nbytes value with zero to see whether there is anything to do at all.

In pseudo code:

nbytes--;
if nbytes < 0 return 1
add pmem1+nbytes
load [pmem1+nbytes]
add pmem2+nbytes
load [pmem2+nbytes]
subtract
compare with zero
and so on

You went straight from the nbytes-- to the loads without doing the nbytes >= 0 comparison.
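The missing check can be sketched in C with the control flow laid out the way the assembly needs it: jump to the decrement-and-test first, and only enter the loop body when the counter is still non-negative. This is an illustrative sketch; compare_with_entry_check is a hypothetical name, and the labels are chosen to resemble those in the question.

```c
/* Hypothetical C rendering of "decrement, test, then compare". */
int compare_with_entry_check(int nbytes, char *pmem1, char *pmem2)
{
    goto next;                      /* skip the loop body on entry */

for_loop:
    if (pmem1[nbytes] != pmem2[nbytes]) {
        return 0;                   /* mismatch: return 0 */
    }

next:
    nbytes--;                       /* decrement the counter... */
    if (nbytes >= 0) {              /* ...and test the result */
        goto for_loop;
    }
    return 1;                       /* counter went negative: all equal */
}
```

With nbytes equal to 0 the function returns 1 without touching either buffer, which matches the for loop in the original C.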

The mnemonic for branch-if-equal is BEQ, not BE, and you want BPL instead of BLPL. So fix those, and at the very beginning do an unconditional branch to _next (whose decrement must be SUBS so that BPL tests its result), and I think that is it; you have it coded.

byte_cmp_loop PROC
; assuming: r0 = nbytes, r1=pmem1, r2 = pmem2

    B _next

_for_loop: 
    ADD R3, R1, R0    ;
    ADD R4, R2, R0    ; calculate pmem + n
    LDRB R3, [R3]     ;
    LDRB R4, [R4]     ; look at this address

    CMP R3, R4        ; if cmp = 0, then jump over return

    BEQ _next          ; if statement by "branch"-cmd
        MOV R0, #0    ; return value is zero
        BX LR         ; always return 0 here
_next:

    SUBS R0, R0, #1   ; loop counting; the S suffix sets the flags for BPL
    BPL _for_loop     ; pl = if positive or zero

    MOV R0, #1        ;
    BX LR             ; always return 1 here

ENDP
