ARM Cortex M7 unaligned access and memcpy

I am compiling this code for a Cortex M7 using GCC:

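#include <stdint.h>
#include <string.h>

// copy manually, one byte at a time (little-endian order)
void write_test_plain(uint8_t * ptr, uint32_t value)
{
    *ptr++ = (uint8_t)(value);
    *ptr++ = (uint8_t)(value >> 8);
    *ptr++ = (uint8_t)(value >> 16);
    *ptr++ = (uint8_t)(value >> 24);
}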

// copy using memcpy
void write_test_memcpy(uint8_t * ptr, uint32_t value)
{
    void *px = (void*)&value;
    memcpy(ptr, px, 4);
}

int main(void) 
{
    extern uint8_t data[];
    extern uint32_t value;

    // I added some offsets to data to
    // make sure the compiler cannot
    // assume it's aligned in memory

    write_test_plain(data + 2, value);
    __asm volatile("": : :"memory"); // just to split inlined calls
    write_test_memcpy(data + 5, value);

    // ... do something with data ...
}

And I get the following Thumb2 assembly with -O2:

// write_test_plain(data + 2, value);
800031c:    2478        movs    r4, #120 ; 0x78
800031e:    2056        movs    r0, #86  ; 0x56
8000320:    2134        movs    r1, #52  ; 0x34
8000322:    2212        movs    r2, #18  ; 0x12
8000324:    759c        strb    r4, [r3, #22]
8000326:    75d8        strb    r0, [r3, #23]
8000328:    7619        strb    r1, [r3, #24]
800032a:    765a        strb    r2, [r3, #25]

// write_test_memcpy(data + 5, value);
800032c:    4ac4        ldr     r2, [pc, #784]  ; (8000640 <main+0x3a0>)
800032e:    923b        str     r2, [sp, #236]  ; 0xec
8000330:    983b        ldr     r0, [sp, #236]  ; 0xec
8000332:    f8c3 0019   str.w   r0, [r3, #25]

Can someone explain how the memcpy version works? It looks like an inlined 32-bit store to the destination address, but isn't that a problem, since data + 5 is most certainly not aligned to a 4-byte boundary?

Is this perhaps some optimization that happens because of undefined behavior in my source?

For Cortex-M processors, unaligned loads and stores of half-words and words are usually allowed (byte accesses are always aligned), and most compilers take advantage of this when generating code unless they are instructed not to. If you want to prevent GCC from assuming that unaligned accesses are OK, you can use the -mno-unaligned-access compiler flag.

If you specify this flag, GCC will no longer inline the call to memcpy, and write_test_memcpy looks like this:

write_test_memcpy(unsigned char*, unsigned long):
  push {lr}
  sub sp, sp, #12
  movs r2, #4
  add r3, sp, #8
  str r1, [r3, #-4]!
  mov r1, r3
  bl memcpy
  add sp, sp, #12
  ldr pc, [sp], #4
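
To see why the inlined store is not the result of undefined behavior, it may help to compare the two approaches on a host machine. The following is a minimal sketch, not the poster's code: the names store_u32_bytes, store_u32_memcpy and load_u32_memcpy are made up for illustration. memcpy places no alignment requirement on either pointer, so the call is well defined for any destination; whether the compiler lowers it to a single str or to a library call is purely a code generation choice.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise store: always writes the value in little-endian order,
   regardless of the byte order of the machine running the code. */
void store_u32_bytes(uint8_t *dst, uint32_t value)
{
    assert(dst != NULL);
    dst[0] = (uint8_t)(value);
    dst[1] = (uint8_t)(value >> 8);
    dst[2] = (uint8_t)(value >> 16);
    dst[3] = (uint8_t)(value >> 24);
}

/* memcpy store: copies the object representation of value, so the
   byte order in memory is the target's own (little-endian on a
   typical Cortex-M configuration). Valid for any alignment of dst. */
void store_u32_memcpy(uint8_t *dst, uint32_t value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}

/* Matching load: valid for any alignment of src. */
uint32_t load_u32_memcpy(const uint8_t *src)
{
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}
```

On a little-endian target the two store functions leave the same four bytes in memory, which is why the compiler is free to emit one word store for the memcpy variant when it knows the core tolerates unaligned words.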

Cortex-M7, M4, M3 and M33 support unaligned access. Cortex-M0, M0+ and M23 do not.

However, you can disable unaligned access support on the Cortex-M7 by setting the UNALIGN_TRP bit in the Configuration and Control Register (CCR); any unaligned access will then generate a UsageFault.
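
Setting the trap bit could look roughly like the sketch below. It assumes the ARMv7-M layout, where UNALIGN_TRP is bit 3 of the CCR and the register sits at 0xE000ED14; the helper set_unaligned_trap and the two macros are hypothetical names for illustration. The register is passed in as a pointer so the logic can also be exercised off-target against an ordinary variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ARMv7-M layout: UNALIGN_TRP is bit 3 of the CCR. */
#define CCR_UNALIGN_TRP_BIT (1u << 3)

/* Assumed address of the CCR (SCB->CCR); only meaningful on the target. */
#define SCB_CCR_ADDRESS 0xE000ED14u

/* Hypothetical helper: enable or disable trapping of unaligned accesses
   by setting or clearing UNALIGN_TRP in the register that ccr points to. */
void set_unaligned_trap(volatile uint32_t *ccr, int enable)
{
    assert(ccr != NULL);
    if (enable)
        *ccr |= CCR_UNALIGN_TRP_BIT;
    else
        *ccr &= ~CCR_UNALIGN_TRP_BIT;
}
```

On the device itself the call would be set_unaligned_trap((volatile uint32_t *)SCB_CCR_ADDRESS, 1); with the CMSIS headers the usual spelling is SCB->CCR |= SCB_CCR_UNALIGN_TRP_Msk;.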

From the compiler's perspective, the default is that the generated assembly performs unaligned accesses unless you disable this with the -mno-unaligned-access compile flag.
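
One way to check which assumption a given build is using is the ACLE predefined macro __ARM_FEATURE_UNALIGNED, which ARM-targeting compilers define when they are allowed to emit unaligned accesses. The function name below is a made-up example; on a non-ARM host the macro is normally absent, so the function simply reports 0 there.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when the compiler was allowed to emit
   unaligned accesses for this translation unit, 0 otherwise.
   __ARM_FEATURE_UNALIGNED is the ACLE macro for that setting. */
int compiler_assumes_unaligned_ok(void)
{
#if defined(__ARM_FEATURE_UNALIGNED)
    return 1;
#else
    return 0;
#endif
}
```

Building the same file with and without -mno-unaligned-access on the ARM toolchain should flip the result, which makes this a cheap sanity check for a build configuration.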
